[Ovmsdev] OVMS Poller module/singleton
Michael Geddes
frog at bunyip.wheelycreek.net
Sun May 12 22:58:23 HKT 2024
Yeah - I certainly wasn't going to put a hard limit, just a log above a
certain time. That being said, the idea of just collecting stats (being
able to turn it on via a "poller timer" set of commands) would be much
more useful. I'll look into that.
Average time is probably a good stat - and certainly what we care about.
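As a first cut, something along these lines for the stats collection (just
a sketch - the struct and names are made up, and it assumes
esp_timer_get_time() is used for the timing):

  #include <cstdint>
  #include "esp_timer.h"

  // Per-entry-type timing stats (hypothetical names).
  struct PollerTimeStats
    {
    uint32_t count = 0;     // entries processed
    uint64_t total_us = 0;  // sum of processing times; average = total/count
    uint32_t max_us = 0;    // worst case seen so far
    };

  static PollerTimeStats s_stats[8];  // one slot per OvmsPollEntryType value

  void RecordEntryTime(unsigned entry_type, int64_t start_us)
    {
    uint32_t elapsed = (uint32_t)(esp_timer_get_time() - start_us);
    PollerTimeStats& st = s_stats[entry_type];
    st.count++;
    st.total_us += elapsed;
    if (elapsed > st.max_us)
      st.max_us = elapsed;
    }

The "poller timer" commands could then report count / average / max per
entry type.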
I'm actually hopeful that those couple of changes I made might help reduce
that average time quite a bit (the short-cutting of the ISO-TP protocol
handling especially).
That PR with the logging changes might help reduce the unproductive logging
time further, and it also makes it possible to turn on the poller logging
without the RX task logs kicking in.
//.ichael
On Sun, 12 May 2024 at 22:29, Michael Balzer via OvmsDev <
ovmsdev at lists.openvehicles.com> wrote:
> Warning / gathering debug statistics about slow processing can be helpful,
> but there must not be a hard limit. Frame/poll response processing may need
> disk or network I/O, and the vehicle task may be starved by occasional
> high loads on higher-priority tasks (e.g. networking) or by having to wait
> for some semaphore -- that's outside the application's control, and must
> not lead to termination/recreation of the task (in case you were heading
> in that direction).
>
> I have no idea how much processing time the current vehicles actually need
> in their respective worst cases. Your draft is probably too lax; poll
> responses and frames normally need to be processed much faster. I'd say 10
> ms is already too slow, but any wait for a queue/semaphore will already
> mean at least 10 ms (one FreeRTOS tick). Probably best to begin with just
> collecting stats.
>
> Btw, to help in narrowing down the actual problem case, the profiler could
> collect max times per RX message ID.
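>
> A minimal sketch of that per-ID tracking (names are placeholders, and it
> could live in the profiler):
>
>   #include <cstdint>
>   #include <map>
>
>   // Worst observed processing time per incoming CAN message ID.
>   static std::map<uint32_t, uint32_t> s_rx_max_us;
>
>   void RecordRxTime(uint32_t msgid, uint32_t elapsed_us)
>     {
>     uint32_t& slot = s_rx_max_us[msgid];  // created as 0 on first use
>     if (elapsed_us > slot)
>       slot = elapsed_us;
>     }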
>
> Regards,
> Michael
>
>
> On 12.05.24 at 10:41, Michael Geddes wrote:
>
> I have a question for Michael B (or whoever) - I have a commit lined up
> that would add a bit of a time check to the poller loop. What should we
> expect the maximum time to execute a poller loop command to be?
> This is a rough idea (in ms) I have, based on nothing much really, so any
> ideas would be appreciated:
> int TardyMaxTime_ms(OvmsPoller::OvmsPollEntryType entry_type)
>   {
>   switch (entry_type)
>     {
>     case OvmsPoller::OvmsPollEntryType::Poll:      return 80;
>     case OvmsPoller::OvmsPollEntryType::FrameRx:   return 30;
>     case OvmsPoller::OvmsPollEntryType::FrameTx:   return 20;
>     case OvmsPoller::OvmsPollEntryType::Command:   return 10;
>     case OvmsPoller::OvmsPollEntryType::PollState: return 15;
>     default:                                       return 80;
>     }
>   }
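>
> For context, the check would sit in the poller loop roughly like this (a
> sketch - start_us and entry are placeholders, timing via
> esp_timer_get_time()):
>
>   int64_t elapsed_ms = (esp_timer_get_time() - start_us) / 1000;
>   if (elapsed_ms > TardyMaxTime_ms(entry.entry_type))
>     ESP_LOGW(TAG, "Tardy poller entry type %d: %lldms",
>       (int)entry.entry_type, (long long)elapsed_ms);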
>
> //.ichael
>
> On Mon, 6 May 2024 at 07:45, Michael Geddes <frog at bunyip.wheelycreek.net>
> wrote:
>
>> I realise that I was only testing with the standard cable - which is
>> probably not sufficient - I haven't looked closely at how the Leaf OBD
>> to DB9 cable differs from the standard one.
>>
>> Ah, my bad about the queue length. We are definitely queueing more
>> messages though. From my log of when the overflow happened, the poller
>> was in state 0, which means OFF - i.e. nothing was being sent!!
>>
>> I'll look at the TX message thing - opt-in sounds good - though it
>> shouldn't be playing that much of a part here, as the TXs are infrequent
>> in this case (or zero when the Leaf is off or driving). On the Ioniq 5,
>> when I'm using the HUD, I'm polling quite frequently - multiple times per
>> second - and that seems to be fine!
>>
>> I did find an issue with the throttling... but it would still mostly
>> apply the throttling where it matters, so again, it shouldn't be the
>> problem (also, we aren't transmitting in the Leaf case).
>>
>> The change I made to the logging of RX messages showed how many were
>> dropped in a row... and it was mostly only 1 per run - which means that
>> even if the gap between them is short, the drops are being interleaved
>> with at least one success!
>>
>> Sooo.. I'm still wondering what is going on. Some things I'm going to
>> try:
>>
>> * If the number of messages on the CAN bus (coming in through RX) means
>> that the queue is slowly getting longer and not quite catching up, then
>> making the queue longer will help it last longer... but only pushes the
>> problem down the road.
>>   - Add 'current queue length' to the poller status information to see
>> if this is indeed the case? (See the sketch after this list.)
>>   - Add some kind of alert when the queue reaches a certain % full?
>> * Once you start overflowing and getting overflow log messages, I wonder
>> if this is then contributing to the problem.
>>   - Push the overflow logging into the Poller Task, which can look at how
>> many drops have occurred since the last received item.
>> * Split up the flags for the poller messages into two:
>>   - Messages that are/could be happening in the TX/RX tasks
>>   - Other noisy messages that always happen in the poller task.
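>>
>> For the queue-length idea above, something along these lines (a sketch -
>> m_queue stands in for the poller queue handle, and the capacity constant
>> may not be the right one):
>>
>>   UBaseType_t waiting = uxQueueMessagesWaiting(m_queue);
>>   UBaseType_t capacity = CONFIG_OVMS_VEHICLE_CAN_RX_QUEUE_SIZE;
>>   if (waiting * 4 >= capacity * 3)  // alert at >= 75% full
>>     ESP_LOGW(TAG, "Poller queue filling: %u/%u entries",
>>       (unsigned)waiting, (unsigned)capacity);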
>>
>> Thoughts on what else we might measure to figure out what is going on?
>>
>> //.ichael
>>
>>
>> On Sun, 5 May 2024, 19:29 Michael Balzer via OvmsDev, <
>> ovmsdev at lists.openvehicles.com> wrote:
>>
>>> Michael,
>>>
>>> the queue size isn't in bytes, it's in messages:
>>>
>>> * @param uxQueueLength The maximum number of items that the queue can
>>> contain.
>>> *
>>> * @param uxItemSize The number of bytes each item in the queue will
>>> require.
>>>
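>>> That is, the length passed to xQueueCreate() is a count of entries and the
>>> item size is in bytes - for example (placeholder entry type, not the actual
>>> OVMS call site):
>>>
>>>   QueueHandle_t q = xQueueCreate(
>>>       60,                  // uxQueueLength: up to 60 queued entries
>>>       sizeof(PollEntry));  // uxItemSize: bytes per entry
>>>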
>>>
>>> Also, from the time stamps in Derek's log excerpt, there were quite a few
>>> dropped frames in that time window -- at least 23 frames in 40 ms, and
>>> that's bad.
>>>
>>> Queue sizes are currently:
>>>
>>> CONFIG_OVMS_HW_CAN_RX_QUEUE_SIZE=60
>>> CONFIG_OVMS_VEHICLE_CAN_RX_QUEUE_SIZE=60
>>>
>>> The new poller now channels all TX callbacks through the task queue, in
>>> addition to RX and commands. So setting the queue size to be larger than
>>> the CAN RX queue size seems appropriate.
>>>
>>> Nevertheless, an overflow with more than 60 waiting messages still
>>> indicates that some processing in the vehicle task is taking too long.
>>>
>>> TX callbacks were previously done directly in the CAN context, and no
>>> current vehicle overrides the empty default handler, so this imposed almost
>>> no additional overhead. By requiring a queue entry for each TX callback,
>>> this feature now has a potentially high impact for all vehicles. If passing
>>> these to the task is actually necessary, it needs to become an opt-in
>>> feature, so that only vehicles subscribing to the callback have to cope
>>> with the additional load and the potential processing delays involved.
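>>>
>>> One possible shape for that opt-in (just a sketch, names hypothetical): a
>>> vehicle that wants TX callbacks sets a flag, and the CAN TX path only
>>> queues the callback entry when the flag is set:
>>>
>>>   // In the constructor of a vehicle that needs TX callbacks:
>>>   m_poll_txcallback_wanted = true;   // hypothetical opt-in flag
>>>
>>>   // In the poller's TX hook:
>>>   if (m_poll_txcallback_wanted)
>>>     QueueTxCallback(frame, success); // hypothetical; otherwise skip the entry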
>>>
>>> Regards,
>>> Michael
>>>
>>>
> --
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>