[Ovmsdev] OVMS Poller module/singleton

Michael Geddes frog at bunyip.wheelycreek.net
Wed May 15 12:51:40 HKT 2024


Formatting aside, I have implemented what I think Michael B was
suggesting.  This is a sample run on the Ioniq 5 (which doesn't have
unsolicited RX events).

This uses esp_timer_get_time() to get a 64-bit *microseconds since boot*
value and works out the execution time that way. It's looking at absolute
time, not time spent in the task, so other things going on at the same time
in other tasks will have an effect.  (The normal tick count doesn't have
nearly enough resolution to be useful - any other ideas on measurement?)
I've got the total accumulated time displaying in seconds and the average
in milliseconds currently, but I can change that easily enough. The
cumulative time is stored as a uint64_t, which will be plenty; 32 bits
wouldn't be nearly enough.
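
For reference, a minimal sketch of the kind of per-category accumulation
involved - the names here (PollerTimeStats etc.) are illustrative
placeholders, not the actual OVMS classes:

    #include <stdint.h>
    #include "esp_timer.h"   // esp_timer_get_time(): int64_t microseconds since boot

    struct PollerTimeStats
      {
      uint32_t count = 0;     // number of entries processed
      uint64_t total_us = 0;  // accumulated time; 32 bits would overflow quickly

      void Add(uint64_t elapsed_us) { ++count; total_us += elapsed_us; }
      double TotalSecs() const { return total_us / 1e6; }
      double AvgMs() const { return count ? (total_us / 1000.0) / count : 0.0; }
      };

    // Around each poller queue entry:
    //   uint64_t start = esp_timer_get_time();
    //   ProcessEntry(entry);                      // the actual work for this entry
    //   stats.Add(esp_timer_get_time() - start);  // absolute time, so preemption is included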

OVMS# *poller time on*
Poller timing is now on

OVMS# *poller time status*
Poller timing is: on
Poll [PRI]               : n=390 tot=0.2s ave=0.586ms
Poll [SRX]               : n=316 tot=0.1s ave=0.196ms
CAN1 RX[0778]            : n=382 tot=0.2s ave=0.615ms
CAN1 RX[07a8]            : n=48 tot=0.0s ave=0.510ms
CAN1 RX[07bb]            : n=162 tot=0.1s ave=0.519ms
CAN1 RX[07ce]            : n=33 tot=0.0s ave=0.469ms
CAN1 RX[07ea]            : n=408 tot=0.2s ave=0.467ms
CAN1 RX[07ec]            : n=486 tot=0.2s ave=0.477ms
CAN3 RX[07df]            : n=769 tot=0.2s ave=0.261ms
CAN1 TX[0770]            : n=191 tot=0.0s ave=0.054ms
CAN1 TX[07a0]            : n=16 tot=0.0s ave=0.047ms
CAN1 TX[07b3]            : n=31 tot=0.0s ave=0.069ms
CAN1 TX[07c6]            : n=11 tot=0.0s ave=0.044ms
CAN1 TX[07e2]            : n=82 tot=0.0s ave=0.067ms
CAN1 TX[07e4]            : n=54 tot=0.0s ave=0.044ms
Set State                : n=7 tot=0.0s ave=0.104ms

This is probably going to be quite useful in general!  The TX callbacks
don't seem to be significant here. (Oh, I should probably implement a reset
of the values too.)

//.ichael

On Sun, 12 May 2024 at 22:58, Michael Geddes <frog at bunyip.wheelycreek.net>
wrote:

> Yeah - I certainly wasn't going to put a hard limit, just a log above a
> certain time. That being said, the idea of just collecting stats (being
> able to turn it on via a "poller timer" set of commands) would be much
> more useful.  I'll look into that.
>
> Average time is probably a good stat - and certainly what we care about.
>
> I'm actually hopeful that those couple of changes I made might help reduce
> that average time quite a bit (short-cutting the ISO-TP protocol
> handling especially).
>
> That p/r with logging changes might help reduce the unproductive log time
> further, but also makes it possible to turn on the poller logging without
> the RX task logs kicking in.
>
> //.ichael
>
>
> On Sun, 12 May 2024 at 22:29, Michael Balzer via OvmsDev <
> ovmsdev at lists.openvehicles.com> wrote:
>
>> Warning about / gathering debug statistics on slow processing can be
>> helpful, but there must not be a hard limit. Frame/poll response processing
>> may need disk or network I/O, and the vehicle task may be starved by
>> occasional high loads on higher-priority tasks (e.g. networking) or by
>> needing to wait for some semaphore -- that's outside the application's
>> control, and must not lead to termination/recreation of the task (in case
>> you're heading in that direction).
>>
>> I have no idea how much processing time the current vehicles actually
>> need in their respective worst cases. Your draft is probably too lax; poll
>> responses and frames normally need to be processed much faster. I'd say 10
>> ms is already too slow, but any wait for a queue/semaphore will already
>> mean at least 10 ms (one FreeRTOS tick). Probably best to begin with just
>> collecting stats.
>>
>> Btw, to help in narrowing down the actual problem case, the profiler
>> could collect max times per RX message ID.
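>>
>> (As a quick sketch of that per-ID maximum idea - the names here are
>> hypothetical, not existing OVMS code:)
>>
>>     #include <stdint.h>
>>     #include <map>
>>
>>     std::map<uint32_t, uint64_t> max_us_by_msgid;   // worst case per RX message ID
>>
>>     void RecordRxTime(uint32_t msgid, uint64_t elapsed_us)
>>       {
>>       uint64_t &worst = max_us_by_msgid[msgid];     // operator[] inserts 0 on first use
>>       if (elapsed_us > worst)
>>         worst = elapsed_us;
>>       }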
>>
>> Regards,
>> Michael
>>
>>
>> On 12.05.24 at 10:41, Michael Geddes wrote:
>>
>> I have a question for Michael B (or whoever): I have a commit lined up
>> that would add a bit of a time check to the poller loop. What do we expect
>> the maximum time to execute a poller loop command should be?
>> This is a rough idea (in ms) I have, based on nothing much really, so
>> any ideas would be appreciated:
>>     int TardyMaxTime_ms(OvmsPoller::OvmsPollEntryType entry_type)
>>       {
>>       switch (entry_type)
>>         {
>>         case OvmsPoller::OvmsPollEntryType::Poll: return 80;
>>         case OvmsPoller::OvmsPollEntryType::FrameRx: return 30;
>>         case OvmsPoller::OvmsPollEntryType::FrameTx: return 20;
>>         case OvmsPoller::OvmsPollEntryType::Command: return 10;
>>         case OvmsPoller::OvmsPollEntryType::PollState: return 15;
>>         default: return 80;
>>         }
>>       }
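>>
>> (Roughly how I'd wire that in around one queue entry - the entry, TAG and
>> processing names below are placeholders, not the actual code:)
>>
>>     int64_t start = esp_timer_get_time();
>>     ProcessPollEntry(entry);                 // handle Poll / FrameRx / FrameTx / ...
>>     int64_t elapsed_ms = (esp_timer_get_time() - start) / 1000;
>>     if (elapsed_ms > TardyMaxTime_ms(entry.entry_type))
>>       ESP_LOGW(TAG, "Tardy poller entry (type %d): %lldms",
>>                (int)entry.entry_type, (long long)elapsed_ms);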
>>
>> //.ichael
>>
>> On Mon, 6 May 2024 at 07:45, Michael Geddes <frog at bunyip.wheelycreek.net>
>> wrote:
>>
>>> I realise that I was only using the standard cable to test - which
>>> probably is not sufficient - I haven't looked closely at how the Leaf OBD
>>> to DB9 cable differs from the standard one.
>>>
>>> Ah, my bad about the queue length.   We are definitely queueing more
>>> messages though.   From my log of when the overflow happened, the poller
>>> was in state 0, which means OFF - i.e. nothing was being sent!!
>>>
>>> I'll look at the TX message thing - opt-in sounds good - though it
>>> shouldn't be playing that much of a part here, as the TXs are infrequent in
>>> this case (or zero when the Leaf is off or driving). On the Ioniq 5, when
>>> I'm using the HUD I'm polling quite frequently - multiple times per
>>> second - and that seems to be fine!
>>>
>>> I did find an issue with the throttling, but it would still mostly
>>> apply the throttling where it matters, so again, it shouldn't be the
>>> problem (also, we aren't transmitting in the Leaf case).
>>>
>>> The change I made to the logging of RX messages showed how many in a row
>>> were dropped... and it was mostly only 1 in a run - which means that even
>>> if the time between them is short, the drops are being interleaved with at
>>> least one success!
>>>
>>> Sooo.. I'm still wondering what is going on.  Some things I'm going to
>>> try:
>>>
>>> * If the number of messages on the CAN bus (coming in through RX) means
>>> that the queue is slowly getting longer and never quite catching up, then
>>> making the queue longer will help it last longer... but that only pushes the
>>> problem down the road.
>>>   - Add 'current queue length' to the poller status information to see
>>> if this is indeed the case? (See the sketch after this list.)
>>>   - Add some kind of alert when the queue reaches a % full?
>>> * Once you start overflowing and getting overflow log messages, I wonder
>>> if this is then contributing to the problem.
>>>   - Push the overflow logging into the Poller Task, which can look at how
>>> many drops occurred since the last received item.
>>> * Split the flags for the poller messages into 2:
>>>    - Messages that are/could be happening in the TX/RX tasks
>>>    - Other noisy messages that always happen in the poller task.
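>>>
>>> (For the 'current queue length' idea above, a minimal sketch using the
>>> standard FreeRTOS queue calls - the 80% threshold and the TAG are just
>>> placeholders:)
>>>
>>>     #include "freertos/FreeRTOS.h"
>>>     #include "freertos/queue.h"
>>>     #include "esp_log.h"
>>>
>>>     static const char *TAG = "poller";   // placeholder log tag
>>>
>>>     void CheckQueueDepth(QueueHandle_t queue, UBaseType_t capacity)
>>>       {
>>>       UBaseType_t waiting = uxQueueMessagesWaiting(queue);  // items currently queued
>>>       if (waiting * 100 >= capacity * 80)                   // alert at 80% full
>>>         ESP_LOGW(TAG, "Poller queue at %u/%u entries",
>>>                  (unsigned)waiting, (unsigned)capacity);
>>>       }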
>>>
>>> Thoughts on what else we might measure to figure out what is going on?
>>>
>>> //.ichael
>>>
>>>
>>> On Sun, 5 May 2024, 19:29 Michael Balzer via OvmsDev, <
>>> ovmsdev at lists.openvehicles.com> wrote:
>>>
>>>> Michael,
>>>>
>>>> the queue size isn't in bytes, it's in messages:
>>>>
>>>>  * @param uxQueueLength The maximum number of items that the queue can
>>>> contain.
>>>>  *
>>>>  * @param uxItemSize The number of bytes each item in the queue will
>>>> require.
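>>>>
>>>> E.g. a queue with room for 60 poller entries would be created along these
>>>> lines (the entry struct name is just a placeholder):
>>>>
>>>>     // length = number of items, item size = bytes per item:
>>>>     QueueHandle_t queue = xQueueCreate(60, sizeof(poll_queue_entry_t));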
>>>>
>>>>
>>>> Also, from the time stamps in Derek's log excerpt, there were quite a few
>>>> dropped frames in that time window -- at least 23 frames in 40 ms; that's
>>>> bad.
>>>>
>>>> Queue sizes are currently:
>>>>
>>>> CONFIG_OVMS_HW_CAN_RX_QUEUE_SIZE=60
>>>> CONFIG_OVMS_VEHICLE_CAN_RX_QUEUE_SIZE=60
>>>>
>>>> The new poller now channels all TX callbacks through the task queue in
>>>> addition to RX and commands. So setting the poller queue size to be larger
>>>> than the CAN RX queue size seems appropriate.
>>>>
>>>> Nevertheless, an overflow with more than 60 waiting messages still
>>>> indicates some too long processing time in the vehicle task.
>>>>
>>>> TX callbacks previously were done directly in the CAN context, and no
>>>> current vehicle overrides the empty default handler, so this imposed almost
>>>> no additional overhead. By requiring a queue entry for each TX callback,
>>>> this feature now has a potentially high impact for all vehicles. If passing
>>>> these to the task is actually necessary, it needs to become an opt-in
>>>> feature, so only vehicles subscribing to the callback have to cope with
>>>> the additional load and the potential processing delays involved.
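>>>>
>>>> (One possible shape for such an opt-in, purely as a sketch - none of these
>>>> names are existing code: a flag the vehicle sets when it wants TX
>>>> callbacks, checked before a callback entry is queued:)
>>>>
>>>>     bool m_poll_tx_callback_wanted = false;   // vehicles opt in by setting this
>>>>
>>>>     void OnPollTxCallback(uint32_t txid, bool success)
>>>>       {
>>>>       if (!m_poll_tx_callback_wanted)
>>>>         return;                        // most vehicles: no queue entry, no extra load
>>>>       QueueTxCallback(txid, success);  // hypothetical helper queueing to the poller task
>>>>       }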
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>