[Ovmsdev] OVMS Poller module/singleton
Michael Balzer
dexter at expeedo.de
Sun May 19 12:50:59 HKT 2024
A builtin web UI for this seems a bit over the top. Builtin web config
pages should focus on user features, this is clearly a feature only
needed during/for the development/extension of a vehicle adapter.
Development features in the web UI are confusing for end users.
If persistent enabling/disabling is done by a simple config command
(e.g. "config set can poller.trace on"), that's also doable by users.
Regards,
Michael
On 19.05.24 at 02:06, Michael Geddes wrote:
> I was so focused on how I calculated the value that I totally
> missed that ‰ would be a better description. I could also use the
> system 'Ratio' unit... so % or ‰.
>
> I'll make space to put 'Avg' on the row. Was trying to limit the
> width for output on a mobile. I agree it would make it easier to
> understand.
> Totals also makes sense.
>
> Should I make this a configuration that can be set on the web-page?
> I'd probably use a configuration change notification so that the very
> bit setting is sync'd with the 'configuration' value.
>
> //.ichael
>
> On Sat, 18 May 2024, 14:05 Michael Balzer, <dexter at expeedo.de> wrote:
>
>> I'm not sure whether the 'max' should be the maximum of the
>> smoothed value.. or the maximum of the raw value.
>
> It should normally be the maximum of the raw value I think, the
> maximum of the smoothed value cannot tell about how bad the
> processing of an ID can become.
>
> The naming in the table is a bit confusing I think. (besides: I've
> never seen "ave" as the abbreviation for average)
>
> If I understand you correctly, "time ms per s" is the time share
> in per mille, so something in that direction would be more clear,
> and "length ms" would then be "time [ms]".
>
> The totals for all averages in the table foot would also be nice.
>
> Maybe "Ave" (or avg?) also should be placed on the left, as the
> "peak" label now suggests being the peak of the average.
>
> Btw, keep in mind, not all "edge" users / testers are developers
> (e.g. the Twizy driver I'm in contact with); collecting stats
> feedback for vehicles from testers should be straightforward.
> Maybe add a data/history record, sent automatically on every
> drive/charge stop when the poller tracing is on?
>
> Regards,
> Michael
>
>
> On 18.05.24 at 02:28, Michael Geddes wrote:
>> You did say max/peak value. I also halved the N for both.
>> I'm not sure whether the 'max' should be the maximum of the
>> smoothed value, or the maximum of the raw value.
>> This is currently the raw-value maximum.
>> The problem is that the middle column is the maximum of
>> {sum over 10 s} / (10 * 1,000,000).
>> I could easily change the 'period' to 1s and see how that goes..
>> was just trying to reduce the larger calculations.
>>
>>
>> Usage: poller [pause|resume|status|times|trace]
>>
>> OVMS# poller time status
>> Poller timing is: on
>> Type | Count | Ave time | Ave length
>> | per s | ms per s | ms
>> -------------+----------+-----------+-----------
>> Poll:PRI | 1.00| 0.559| 0.543
>> peak | | 0.663| 1.528
>> -------------+----------+-----------+-----------
>> Poll:SRX | 0.08| 0.009| 0.038
>> peak | | 0.068| 0.146
>> -------------+----------+-----------+-----------
>> CAN1 RX[778] | 0.11| 0.061| 0.280
>> peak | | 0.458| 1.046
>> -------------+----------+-----------+-----------
>> CAN1 RX[7a8] | 0.04| 0.024| 0.124
>> peak | | 0.160| 0.615
>> -------------+----------+-----------+-----------
>> CAN1 TX[770] | 0.05| 0.004| 0.016
>> peak | | 0.022| 0.102
>> -------------+----------+-----------+-----------
>> CAN1 TX[7a0] | 0.02| 0.002| 0.011
>> peak | | 0.010| 0.098
>> -------------+----------+-----------+-----------
>> CAN1 TX[7b3] | 0.01| 0.001| 0.006
>> peak | | 0.000| 0.099
>> -------------+----------+-----------+-----------
>> CAN1 TX[7e2] | 0.02| 0.002| 0.011
>> peak | | 0.010| 0.099
>> -------------+----------+-----------+-----------
>> CAN1 TX[7e4] | 0.08| 0.008| 0.048
>> peak | | 0.049| 0.107
>> -------------+----------+-----------+-----------
>> Cmd:State | 0.00| 0.000| 0.005
>> peak | | 0.000| 0.094
>>
>> On Fri, 17 May 2024 at 15:26, Michael Geddes
>> <frog at bunyip.wheelycreek.net> wrote:
>>
>> This is what I have now.
>> The one on the end is the one Michael B was after, using an N
>> of 32 (up for discussion).
>>
>> The middle is the time spent in that event per second. It
>> accumulates times (in microseconds), and then every 10 s it
>> stores a smoothed (N=16) value.
>> The Count is similar (except that we store a value of '100'
>> per event, so it can stay in integers while showing 2 decimal places).
>>
>> Every received poll does a 64-bit difference truncated to 32-bit
>> (for the elapsed time) and a 64-bit comparison (for end-of-period).
>> It also does 1x 32-bit smoothing and 2x 32-bit adds.
>>
>> Then at the end of a 10s period, it will do a 64bit add to
>> get the next end-of-period value, as well as the 2x 32bit
>> smoothing calcs.
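A hypothetical sketch of that per-period accounting (names like `PeriodStats` are invented here for illustration; this is not the actual OVMS code): microseconds are accumulated per event, and at each 10 s period boundary the accumulator is folded into a smoothed (N = 16) average, with the division done as a shift.

```cpp
#include <cstdint>

// Sketch: accumulate elapsed microseconds per event; every 10 s period,
// fold the accumulator into a smoothed (N = 16) "time per period" value.
struct PeriodStats
  {
  uint32_t accum_us = 0;     // time accumulated in the current period
  uint32_t smoothed_us = 0;  // smoothed "time per period" value

  void add_event(uint32_t elapsed_us)
    {
    accum_us += elapsed_us;
    }

  // Called at the end of every 10 s period.
  void end_period()
    {
    smoothed_us = ((15 * smoothed_us) + accum_us) >> 4;  // N = 16
    accum_us = 0;
    }
  };
```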
>>
>> This is from the Ioniq 5, so no big values yet. You can
>> certainly see how insignificant the TX callbacks are.
>> I'll leave it on for when the car is moving and gets some
>> faster polling.
>>
>> OVMS# poll time status
>> Poller timing is: on
>> Type | Count | Ave time | Ave Length
>> | per s | ms per s | ms
>> -------------+----------+-----------+-----------
>> Poll:PRI | 1.00| 0.540| 0.539
>> Poll:SRX | 0.03| 0.004| 0.017
>> CAN1 RX[778] | 0.06| 0.042| 0.175
>> CAN1 TX[770] | 0.04| 0.002| 0.008
>> Cmd:State | 0.01| 0.001| 0.005
>>
>> ----------------------8<--------------------------------
>>
>> Nice smoothing class (forces N as a power of 2):
>> constexpr unsigned floorlog2(unsigned x)
>>   {
>>   return x == 1 ? 0 : 1 + floorlog2(x >> 1);
>>   }
>>
>> /* Maintain a smoothed average using shifts for division.
>>  * T should be an integer type.
>>  * N needs to be a power of 2.
>>  */
>> template <typename T, unsigned N>
>> class average_util_t
>>   {
>>   private:
>>     T m_ave;
>>   public:
>>     average_util_t() : m_ave(0) {}
>>     static const uint8_t _BITS = floorlog2(N);
>>     void add(T val)
>>       {
>>       static_assert(N == (1 << _BITS), "N must be a power of 2");
>>       m_ave = (((N-1) * m_ave) + val) >> _BITS;
>>       }
>>     T get() { return m_ave; }
>>     operator T() { return m_ave; }
>>   };
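As a quick standalone sanity check of that class (condensed below so the snippet compiles on its own): feeding a constant value converges toward it, though integer truncation can leave the result up to N-1 below the steady input.

```cpp
#include <cstdint>

// Condensed copy of the smoothing class, just enough for a usage check.
// (Division by N rather than a shift; identical for unsigned values.)
template <typename T, unsigned N>
class average_util_t
  {
  private:
    T m_ave = 0;
  public:
    void add(T val) { m_ave = (((N - 1) * m_ave) + val) / N; }
    T get() const { return m_ave; }
  };
```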
>>
>>
>>
>> On Thu, 16 May 2024 at 10:29, Michael Geddes
>> <frog at bunyip.wheelycreek.net> wrote:
>>
>> Thanks Michael,
>>
>> My calculations give me ((2^32)-1) / (1000*1000*3600) =
>> only 1.2 hours of processing time in 32bit. The initial
>> subtraction is 64bit anyway and I can't see a
>> further 64bit addition being a problem. I have the
>> calculations being performed in doubles at print-out
>> where performance is not really an issue anyway. (Though
>> apparently doing 64 bit division is worse than floating
>> point).
>>
>> In addition:
>> * I currently have this set up so it can be turned on and
>> off and reset manually (only do it when required).
>>
>> * For the lower volume commands, the smoothed average is
>> not going to be useful - the count is more interesting
>> for different reasons.
>>
>> * The total time is quite useful. I.e. a high average time
>> doesn't matter if the count is low. The things that are
>> affecting performance are stuff with high total time.
>> Stuff which is happening 100 times a second needs to be a
>> much lower average than once a second.
>>
>> * A measure like 'time per minute/second' and possibly
>> count per minute/seconds as a smoothed average would
>> potentially be more useful. (or in addition?)
>> I think we could do _that_ in a reasonably efficient
>> manner using a 64 bit 'last measured time', a 32 bit
>> accumulated value and the stored 32 bit rolling average.
>> It boils down to some iterative (integer) sums and
>> multiplications plus a divide by n ^ (time periods
>> passed) - which is a shift - and which can be optimised
>> to '0' if 'time-periods-passed' is more than
>> 32/(bits-per-n) - effectively limiting the number of
>> iterations.
>> The one issue I can see is that we need to calculate
>> 'number of time-periods passed' which is a 64 bit
>> subtraction followed by a 32 bit division (not
>> optimisable to a simple shift).
>>
>> * I'm also happy to keep a rolling (32bit) average time.
>> Even if you assume averages in the 100ms, 32bit is
>> going to happily support an N of 64 or even 128.
>> Am I right in thinking that the choice of N is highly
>> dependent on frequency? For things happening 100 times
>> per second, you might want an N like 128, whereas for
>> things happening once per second, you might want an N of
>> 4 or 8. For the other things we keep track of in this
>> manner, we have a better idea of their frequency.
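One reading of the "divide by n ^ (time periods passed), which is a shift" idea above, sketched standalone (names invented, not project code): if the per-period average is divided by N for every empty period, and N is a power of 2 (here 16, a 4-bit shift), the whole catch-up collapses into a single shift, and once the total shift reaches 32 bits the result is 0 for any 32-bit value, which bounds the work exactly as described.

```cpp
#include <cstdint>

constexpr uint32_t N_SHIFT = 4;  // N = 16

// Decay a smoothed per-period average across several empty periods.
uint32_t catch_up(uint32_t avg, uint32_t periods_passed)
  {
  if (periods_passed >= 32 / N_SHIFT)  // 8+ periods: decayed to nothing
    return 0;
  return avg >> (N_SHIFT * periods_passed);
  }
```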
>>
>> How about we have (per record type):
>> * total count (since last reset?) (32 bit)
>> * smoothed average of time per instance (32 bit)
>>
>> * ?xx? total accumulated time since last reset (64bit)
>> ?? <-- with the below stats this is much less useful
>>
>> * last-measured-time (64 bit)
>> * accumulated count since last time-period (16bit - but
>> maybe 32bit anyway for byte alignment?)
>> * smoothed average of count per time-period (32bit)
>> * accumulated time since last time-period (32bit)
>> * smoothed average of time per time-period (32bit)
>>
>> It's possible to keep the
>>
>> Is this going to be too much per record type? The number
>> of 'records' we are keeping is quite low (so 10 to 20
>> maybe) - so it's not a huge memory burden.
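Written out as a struct (field names invented; widths follow the suggestions in the list above), the proposed per-record bookkeeping might look like:

```cpp
#include <cstdint>

// Per-record stats as proposed: totals, a smoothed per-call time, and
// per-time-period accumulators with their smoothed averages.
struct poll_record_stats_t
  {
  uint32_t total_count;            // total count since last reset
  uint32_t avg_time_per_call_us;   // smoothed average time per instance
  uint64_t last_period_start_us;   // last-measured-time
  uint16_t period_count;           // count accumulated this time-period
  uint32_t avg_count_per_period;   // smoothed count per time-period
  uint32_t period_time_us;         // time accumulated this time-period
  uint32_t avg_time_per_period_us; // smoothed time per time-period
  };
```

At roughly 32 bytes per record including padding, 10 to 20 records stay well under 1 KB, consistent with the "not a huge memory burden" estimate.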
>>
>> Thoughts?
>>
>> //.ichael
>> On Thu, 16 May 2024 at 03:09, Michael Balzer via OvmsDev
>> <ovmsdev at lists.openvehicles.com> wrote:
>>
>> esp_timer_get_time() is the right choice for
>> precision timing.
>>
>> I'd say uint32 is enough though, even if counting
>> microseconds that can hold a total of more than 71
>> hours of actual processing time. uint64 has a
>> significant performance penalty, although I don't
>> recall the overhead for simple additions.
>>
>> Also & more important, the average wouldn't be my
>> main focus, but the maximum processing time seen per
>> ID, which seems to be missing in your draft.
>>
>> Second thought on the average… the exact overall
>> average really has a minor meaning, I'd rather see
>> the current average, adapting to the current mode of
>> operation (drive/charge/…). I suggest feeding the
>> measurements to a low pass filter to get the smoothed
>> average of the last n measurements. Pattern:
>>
>> runavg = ((N-1) * runavg + newval) / N
>>
>> By using a low power of 2 for N (e.g. 8 or 16), you
>> can replace the division by a simple bit shift, and
>> have enough headroom to use 32 bit integers.
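Concretely, with N = 16 the pattern reduces to (my sketch, not project code):

```cpp
#include <cstdint>

// runavg = ((N-1) * runavg + newval) / N with N = 16: the division by a
// power of two becomes a right shift by 4 bits, all in 32-bit integers.
inline uint32_t lowpass16(uint32_t runavg, uint32_t newval)
  {
  return ((15 * runavg) + newval) >> 4;
  }
```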
>>
>> Regards,
>> Michael
>>
>>
>> On 15.05.24 at 06:51, Michael Geddes via OvmsDev wrote:
>>> Formatting aside, I have implemented what I think
>>> Michael B was suggesting. This is a sample run on
>>> the Ioniq 5 (which doesn't have unsolicited RX events).
>>>
>>> This uses the call esp_timer_get_time() to get a
>>> 64-bit *microseconds since start* value - and works
>>> out the time to execute that way. It's looking at
>>> absolute time and not time in the Task - so other
>>> things going on at the same time in other tasks will
>>> have an effect. (The normal tick count doesn't have
>>> nearly enough resolution to be useful - any other
>>> ideas on measurement?) I've got total accumulated
>>> time displaying in seconds and the average in
>>> milliseconds currently - but I can change that easily
>>> enough.
>>> The cumulative time is stored as uint64_t which will
>>> be plenty, as 32bit wouldn't be nearly enough.
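The measurement pattern described here can be sketched as below. On the device this would use esp_timer_get_time() (ESP-IDF, 64-bit microseconds since boot); the std::chrono stand-in keeps the snippet portable, and the names are mine rather than the actual OVMS code.

```cpp
#include <chrono>
#include <cstdint>

// Portable stand-in for esp_timer_get_time(): 64-bit microseconds.
static int64_t now_us()
  {
  using namespace std::chrono;
  return duration_cast<microseconds>(
      steady_clock::now().time_since_epoch()).count();
  }

struct timing_t
  {
  uint64_t total_us = 0;  // cumulative time; 64-bit gives ample headroom
  uint32_t count = 0;     // number of timed calls
  };

// Wrap the handler being profiled and accumulate its wall-clock time.
// Note this measures absolute elapsed time, not time spent in the task,
// matching the caveat in the message above.
template <typename Fn>
void timed_call(timing_t &t, Fn fn)
  {
  int64_t start = now_us();
  fn();
  t.total_us += static_cast<uint64_t>(now_us() - start);
  ++t.count;
  }
```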
>>>
>>> OVMS# *poller time on*
>>> Poller timing is now on
>>>
>>> OVMS# *poller time status*
>>> Poller timing is: on
>>> Poll [PRI] : n=390 tot=0.2s ave=0.586ms
>>> Poll [SRX] : n=316 tot=0.1s ave=0.196ms
>>> CAN1 RX[0778] : n=382 tot=0.2s ave=0.615ms
>>> CAN1 RX[07a8] : n=48 tot=0.0s ave=0.510ms
>>> CAN1 RX[07bb] : n=162 tot=0.1s ave=0.519ms
>>> CAN1 RX[07ce] : n=33 tot=0.0s ave=0.469ms
>>> CAN1 RX[07ea] : n=408 tot=0.2s ave=0.467ms
>>> CAN1 RX[07ec] : n=486 tot=0.2s ave=0.477ms
>>> CAN3 RX[07df] : n=769 tot=0.2s ave=0.261ms
>>> CAN1 TX[0770] : n=191 tot=0.0s ave=0.054ms
>>> CAN1 TX[07a0] : n=16 tot=0.0s ave=0.047ms
>>> CAN1 TX[07b3] : n=31 tot=0.0s ave=0.069ms
>>> CAN1 TX[07c6] : n=11 tot=0.0s ave=0.044ms
>>> CAN1 TX[07e2] : n=82 tot=0.0s ave=0.067ms
>>> CAN1 TX[07e4] : n=54 tot=0.0s ave=0.044ms
>>> Set State : n=7 tot=0.0s ave=0.104ms
>>>
>>> This is probably going to be quite useful in
>>> general! The TX call-backs don't seem to be
>>> significant here. (oh, I should probably implement a
>>> reset of the values too).
>>>
>>> //.ichael
>>>
>>> On Sun, 12 May 2024 at 22:58, Michael Geddes
>>> <frog at bunyip.wheelycreek.net> wrote:
>>>
>>> Yeah - I certainly wasn't going to put a hard
>>> limit. Just a log above a certain time, that
>>> being said, the idea of just collecting stats
>>> (being able to turn it on via a "poller timer"
>>> set of commands) would be much more useful.
>>> I'll look into that.
>>>
>>> Average time is probably a good stat - and
>>> certainly what we care about.
>>>
>>> I actually am hopeful that those couple of
>>> things I did might help reduce that average time
>>> quite a bit (that short-cutting the isotp
>>> protocol handling especially).
>>>
>>> That p/r with logging changes might help reduce
>>> the unproductive log time further, but
>>> also makes it possible to turn on the poller
>>> logging without the RX task logs kicking in.
>>>
>>> //.ichael
>>>
>>>
>>> On Sun, 12 May 2024 at 22:29, Michael Balzer via
>>> OvmsDev <ovmsdev at lists.openvehicles.com> wrote:
>>>
>>> Warning / gathering debug statistics about
>>> slow processing can be helpful, but there
>>> must not be a hard limit. Frame/poll
>>> response processing may need disk or network
>>> I/O, and the vehicle task may be starved
>>> by short bursts of high load on higher-priority
>>> tasks (e.g. networking) or by needing to
>>> wait for some semaphore -- that's outside
>>> the application's control, and must not lead
>>> to termination/recreation of the task (in
>>> case you're heading towards that direction).
>>>
>>> I have no idea how much processing time the
>>> current vehicles actually need in their
>>> respective worst cases. Your draft is
>>> probably too lax, poll responses and frames
>>> normally need to be processed much faster.
>>> I'd say 10 ms is already too slow, but any
>>> wait for a queue/semaphore will already mean
>>> at least 10 ms (FreeRTOS tick). Probably
>>> best to begin with just collecting stats.
>>>
>>> Btw, to help in narrowing down the actual
>>> problem case, the profiler could collect max
>>> times per RX message ID.
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> On 12.05.24 at 10:41, Michael Geddes wrote:
>>>> I have a question for Michael B (or
>>>> whoever) - I have a commit lined up that
>>>> would add a bit of a time check to the
>>>> poller loop. What do we expect the
>>>> maximum time to execute a poller loop
>>>> command should be?
>>>> This is a rough idea (in ms) I have.. based
>>>> on nothing much really, so any ideas would
>>>> be appreciated:
>>>> int TardyMaxTime_ms(OvmsPoller::OvmsPollEntryType entry_type)
>>>>   {
>>>>   switch (entry_type)
>>>>     {
>>>>     case OvmsPoller::OvmsPollEntryType::Poll:      return 80;
>>>>     case OvmsPoller::OvmsPollEntryType::FrameRx:   return 30;
>>>>     case OvmsPoller::OvmsPollEntryType::FrameTx:   return 20;
>>>>     case OvmsPoller::OvmsPollEntryType::Command:   return 10;
>>>>     case OvmsPoller::OvmsPollEntryType::PollState: return 15;
>>>>     default:                                       return 80;
>>>>     }
>>>>   }
>>>>
>>>> //.ichael
>>>>
>>>> On Mon, 6 May 2024 at 07:45, Michael Geddes
>>>> <frog at bunyip.wheelycreek.net> wrote:
>>>>
>>>> I realise that I was only using the
>>>> standard cable to test - which
>>>> probably is not sufficient - I haven't
>>>> looked closely at how the Leaf OBD to
>>>> Db9 cable is different from standard.
>>>>
>>>> Ah, my bad about the queue length. We
>>>> are definitely queueing more messages
>>>> though. From my log of when the
>>>> overflow happened, the poller was in
>>>> state 0, which means OFF - i.e. nothing
>>>> was being sent!!
>>>>
>>>> I'll look at the TX message thing - opt
>>>> in sounds good - though it shouldn't be
>>>> playing that much of a part here as the
>>>> TXs are infrequent in this case (or
>>>> zero when the leaf is off or driving) -
>>>> On the ioniq 5 when I'm using the HUD -
>>>> I'm polling quite frequently - multiple
>>>> times per second and that seems to be
>>>> fine!
>>>>
>>>> I did find an issue with the
>>>> throttling .. but it would still
>>>> mostly apply the throttling where it
>>>> matters, so again, it shouldn't be the
>>>> problem (also, we aren't transmitting
>>>> in the leaf case).
>>>>
>>>> The change I made to the logging of RX
>>>> messages showed how many in a row were
>>>> dropped... and it was mostly 1 only in
>>>> a run - which means even if it is a
>>>> short time between - that means that
>>>> the drops are being interleaved by at
>>>> least one success!
>>>>
>>>> Sooo.. I'm still wondering what is
>>>> going on. Some things I'm going to try:
>>>>
>>>> * If the number of messages on the Can
>>>> bus (coming in through RX) means that
>>>> the queue is slowly getting longer and
>>>> not quite catching up, then making the
>>>> queue longer will help it last
>>>> longer... but only pushes the problem
>>>> down the road.
>>>> - Add 'current queue length' to the
>>>> poller status information to see if
>>>> this is indeed the case?
>>>> - Add some kind of alert when the
>>>> queue reaches a % full?
>>>> * Once you start overflowing and
>>>> getting overflow log messages, I wonder
>>>> if this is then contributing to the
>>>> problem.
>>>> - Push the overflow logging into
>>>> Poller Task which can look at how many
>>>> drops occurred since last received item.
>>>> * Split up the flags for the poller
>>>> messages into 2:
>>>> - Messages that are/could be
>>>> happening in the TX/RX tasks
>>>> - Other noisy messages that always
>>>> happen in the poller task.
>>>>
>>>> Thoughts on what else we might measure
>>>> to figure out what is going on?
>>>>
>>>> //.ichael
>>>>
>>>>
>>>> On Sun, 5 May 2024, 19:29 Michael
>>>> Balzer via OvmsDev,
>>>> <ovmsdev at lists.openvehicles.com> wrote:
>>>>
>>>> Michael,
>>>>
>>>> the queue size isn't in bytes, it's
>>>> in messages:
>>>>
>>>>> * @param uxQueueLength The
>>>>> maximum number of items that the
>>>>> queue can contain.
>>>>> *
>>>>> * @param uxItemSize The number of
>>>>> bytes each item in the queue will
>>>>> require.
>>>>
>>>> Also, from the time stamps in
>>>> Dereks log excerpt, there were
>>>> quite some dropped frames in that
>>>> time window -- at least 23 frames
>>>> in 40 ms, that's bad.
>>>>
>>>> Queue sizes are currently:
>>>>
>>>> CONFIG_OVMS_HW_CAN_RX_QUEUE_SIZE=60
>>>> CONFIG_OVMS_VEHICLE_CAN_RX_QUEUE_SIZE=60
>>>>
>>>> The new poller now channels all TX
>>>> callbacks through the task queue
>>>> additionally to RX and commands. So
>>>> setting the queue size to be larger
>>>> than the CAN RX queue size seems
>>>> appropriate.
>>>>
>>>> Nevertheless, an overflow with more
>>>> than 60 waiting messages still
>>>> indicates some too long processing
>>>> time in the vehicle task.
>>>>
>>>> TX callbacks previously were done
>>>> directly in the CAN context, and no
>>>> current vehicle overrides the empty
>>>> default handler, so this imposed
>>>> almost no additional overhead. By
>>>> requiring a queue entry for each TX
>>>> callback, this feature now has a
>>>> potentially high impact for all
>>>> vehicles. If passing these to the
>>>> task is actually necessary, it
>>>> needs to become an opt-in feature,
>>>> so only vehicles subscribing to the
>>>> callback actually need to cope with
>>>> that additional load & potential
>>>> processing delays involved.
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>
>>
>>
>
>
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26