[Ovmsdev] Netmanager priority issue solved
Michael Balzer
dexter at expeedo.de
Sat Jul 25 00:00:27 HKT 2020
Fixed. I had fallen for the same wrong assumption (that a taken mutex
may safely be deleted) in the OvmsMutex / OvmsRecMutex implementation.
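
A minimal sketch of the corrected pattern, assuming (as with the
esp-idf fix below) that the destructor previously deleted the semaphore
while still holding it. The method names follow OvmsMutex; the body is
illustrative, not the verbatim commit:

    OvmsMutex::~OvmsMutex()
      {
      Lock();                     // wait until no other task holds the mutex
      Unlock();                   // release again before deletion, so FreeRTOS
                                  // can run priority disinheritance properly
      vSemaphoreDelete(m_mutex);  // only ever delete a mutex that is not taken
      }
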
Regards,
Michael
On 24.07.20 at 16:02, Michael Balzer wrote:
> It seems we've got another mutex issue in duktape:
>
> OVMS# mo ta
> Number of Tasks = 22      Stack:  Now    Max  Total    Heap  32-bit  SPIRAM  C#  PRI  CPU%  BPR/MH
> 3FFAFB88  1  Blk  esp_timer          436    708   4096   40288     644   31232   0   22    0%   22/ 0
> 3FFC0E90  2  Blk  eventTask          476   1884   4608     104       0       0   0   20    0%   20/ 0
> 3FFC3314  3  Blk  OVMS Events        704   3360   8192   92364       0   35464   1    8    1%    8/ 0
> 3FFC6764  4  Rdy  OVMS DukTape       496  10864  12288     580       0  189492   1    3   10%    3/ *42*
>
> …the DukTape mutex hold count (the 42 above) increasing once per
> minute. I'll have a look.
>
> Regards,
> Michael
>
>
> On 24.07.20 at 15:27, Michael Balzer wrote:
>> TL;DR: you need to pull my latest esp-idf changes.
>>
>> I've finally found & solved the strange priority changes of our
>> netmanager task (i.e. it suddenly being raised from its original 5 to
>> 18/22): the bug was in the esp-idf posix threads mutex implementation.
>>
>> I had suspected the mutex priority inheritance for a while, so I added
>> a way to retrieve the internal mutex hold count for our task list.
>> Using this I noticed the hold count would always & only rise by 2
>> whenever any kind of mongoose connection was closed.
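>>
>> A minimal sketch of such an accessor, assuming a small patch inside
>> FreeRTOS' tasks.c (the function name is my invention; stock FreeRTOS
>> has no public API for this, as TCB_t and its uxMutexesHeld field are
>> private to that file):
>>
>>     UBaseType_t uxTaskGetMutexesHeld( TaskHandle_t xTask )
>>         {
>>         TCB_t *pxTCB = prvGetTCBFromHandle( xTask );  /* NULL = calling task */
>>         return pxTCB->uxMutexesHeld;  /* ++ on mutex take, -- on give */
>>         }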
>>
>> That led me to check the thread concurrency protection of the mongoose
>> mbufs, because every mongoose connection has two mbufs associated with
>> it (rx & tx), each guarded by a posix mutex.
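>>
>> A hypothetical sketch of the connection close path (the member names
>> are assumptions, not actual mongoose fields), showing why the hold
>> count rises by exactly 2 per closed connection:
>>
>>     /* Destroying each mbuf mutex while it counts as taken (see the
>>        bug described below) leaks one hold on the closing task: */
>>     pthread_mutex_destroy(&conn->rx_mbuf_mutex);   /* hold count +1 */
>>     pthread_mutex_destroy(&conn->tx_mbuf_mutex);   /* hold count +1 */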
>>
>> The bug was: posix mutexes were deleted right after locking (taking)
>> them. FreeRTOS mutexes must not be deleted while they are taken; that
>> breaks the priority inheritance (more precisely the disinheritance),
>> with the visible effect being a mutex hold count that never returns to
>> zero.
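>>
>> In FreeRTOS terms, a simplified sketch of the flaw and of the fix (not
>> the verbatim esp-idf source):
>>
>>     /* inside pthread_mutex_destroy(), simplified: */
>>     if (xSemaphoreTake(handle, 0) != pdTRUE)
>>         return EBUSY;            /* another task still holds it */
>>
>>     /* BUG: handle is taken at this point; deleting it now skips the
>>        priority disinheritance and leaks one hold on the calling task. */
>>     xSemaphoreGive(handle);      /* FIX: give it back first ...      */
>>     vSemaphoreDelete(handle);    /* ... then it is safe to delete it */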
>>
>> As a side effect, this may also solve the strange event task starvations
>> (hope so…). I had been investigating those because I suspected some busy
>> loop in the netmanager context: with the netman running at prio 22, that
>> would effectively block almost all other processing, including the timer
>> service. I've also found & fixed one potential busy loop trigger in the
>> netman, caused by the netman task still running after all interfaces had
>> been lost -- not sure if that can actually happen, but it would explain
>> the effects.
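>>
>> A hypothetical sketch of such a guard (loop structure, helper and event
>> names are all assumptions, not the actual netmanager code): block on an
>> event instead of spinning when no interface is left:
>>
>>     for (;;)
>>         {
>>         if (!AnyInterfaceUp())   /* hypothetical helper */
>>             {
>>             /* no interface: wait for an interface-up event instead of
>>                busy polling mongoose with nothing to do */
>>             xEventGroupWaitBits(net_events, NET_IF_UP_BIT,
>>                                 pdTRUE, pdFALSE, portMAX_DELAY);
>>             continue;
>>             }
>>         mg_mgr_poll(&mgr, 250);  /* normal mongoose polling, 250 ms */
>>         }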
>>
>> So please watch your crash debug info & report if the issue still turns up.
>>
>> Regards,
>> Michael
>>
>
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26