[Ovmsdev] Memory leak

Mark Webb-Johnson mark at webb-johnson.net
Thu Apr 26 10:23:23 HKT 2018


I’m seeing this also, when I stress-test by sending in a large canbus dump into retools:

CORRUPT HEAP: Bad head at 0x3f80e3c0. Expected 0xabba1234 got 0x3f80eb00
assertion "head != NULL" failed: file "/Users/mark/esp/esp-idf/components/heap/./multi_heap_poisoning.c", line 229, function: multi_heap_realloc
abort() was called at PC 0x401015e7 on core 1
0x401015e7: __assert_func at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)


Backtrace: 0x40092cec:0x3ffe50f0 0x40092ee7:0x3ffe5110 0x401015e7:0x3ffe5130 0x4009294e:0x3ffe5160 0x40084e59:0x3ffe5180 0x400e4a51:0x3ffe51a0 0x400fb78a:0x3ffe51c0 0x400fb7da:0x3ffe51e0 0x400fb7f0:0x3ffe5200 0x400fe51a:0x3ffe5220 0x4013abd9:0x3ffe5260 0x4013c29e:0x3ffe52b0 0x4013c2b9:0x3ffe52f0
0x40092cec: invoke_abort at /Users/mark/esp/esp-idf/components/esp32/./panic.c:669

0x40092ee7: abort at /Users/mark/esp/esp-idf/components/esp32/./panic.c:669

0x401015e7: __assert_func at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)

0x4009294e: multi_heap_realloc at /Users/mark/esp/esp-idf/components/heap/./multi_heap_poisoning.c:326

0x40084e59: heap_caps_realloc at /Users/mark/esp/esp-idf/components/heap/./heap_caps.c:388

0x400e4a51: ExternalRamRealloc at /Users/mark/Documents/ovms/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/main/./ovms_malloc.c:65

0x400fb78a: mbuf_insert at /Users/mark/Documents/ovms/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11403

0x400fb7da: mbuf_append at /Users/mark/Documents/ovms/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11403

0x400fb7f0: mg_socket_if_tcp_send at /Users/mark/Documents/ovms/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11403

0x400fe51a: mg_send at /Users/mark/Documents/ovms/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11403

0x4013abd9: re::DoServe(CAN_frame_t*) at /Users/mark/Documents/ovms/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/retools/src/retools.cpp:312

…

That code path does a lot of realloc’s.

If I switch back to use normal malloc/calloc/realloc, it still crashes.

If I disable the malloc #defined in mongoose mg_locals.h, it still crashes.

The only way I can avoid this is to consume ALL the data on the MG_EV_RECV event, which means realloc is presumably not called (so often?) within mongoose.

I assumed the conclusion is that fundamentally realloc is broken in the ESP IDF version we use.

I tried updating our ESP-IDF to espressif/master, but get compile errors in build/heap:

CC build/heap/multi_heap_poisoning.o
In file included from /Users/mark/esp/esp-idf/components/heap/multi_heap_poisoning.c:27:0:
/Users/mark/esp/esp-idf/components/heap/multi_heap_platform.h:66:32: error: unknown type name 'TaskHandle_t'
 #define MULTI_HEAP_BLOCK_OWNER TaskHandle_t task;
                                ^
/Users/mark/esp/esp-idf/components/heap/multi_heap_poisoning.c:50:5: note: in expansion of macro 'MULTI_HEAP_BLOCK_OWNER'
     MULTI_HEAP_BLOCK_OWNER
     ^
/Users/mark/esp/esp-idf/components/heap/multi_heap_poisoning.c: In function 'poison_allocated_region':
/Users/mark/esp/esp-idf/components/heap/multi_heap_platform.h:67:57: error: implicit declaration of function 'xTaskGetCurrentTaskHandle' [-Werror=implicit-function-declaration]
 #define MULTI_HEAP_SET_BLOCK_OWNER(HEAD) (HEAD)->task = xTaskGetCurrentTaskHandle()
                                                         ^
/Users/mark/esp/esp-idf/components/heap/multi_heap_poisoning.c:71:5: note: in expansion of macro 'MULTI_HEAP_SET_BLOCK_OWNER'
     MULTI_HEAP_SET_BLOCK_OWNER(head);
     ^
/Users/mark/esp/esp-idf/components/heap/multi_heap_poisoning.c: In function 'multi_heap_get_block_owner':
/Users/mark/esp/esp-idf/components/heap/multi_heap_platform.h:68:42: warning: return makes pointer from integer without a cast [-Wint-conversion]
 #define MULTI_HEAP_GET_BLOCK_OWNER(HEAD) ((HEAD)->task)
                                          ^
/Users/mark/esp/esp-idf/components/heap/multi_heap_poisoning.c:284:12: note: in expansion of macro 'MULTI_HEAP_GET_BLOCK_OWNER'
     return MULTI_HEAP_GET_BLOCK_OWNER((poison_head_t*)multi_heap_get_block_address_impl(block));
            ^
cc1: some warnings being treated as errors
make[1]: *** [multi_heap_poisoning.o] Error 1

I then tried:

+CONFIG_HEAP_POISONING_DISABLED=y
+CONFIG_HEAP_POISONING_LIGHT=
-CONFIG_HEAP_TASK_TRACKING=y

But no difference. Still crashes.

Also tried:

-CONFIG_FREERTOS_USE_TRACE_FACILITY=y
-CONFIG_FREERTOS_USE_STATS_FORMATTING_FUNCTIONS=
+CONFIG_FREERTOS_USE_TRACE_FACILITY=

But no difference. Still crashes.

Although not in a simply way:

OVMS# module memory
Free 8-bit 119432/281952, 32-bit 8020/24276, SPIRAM 4141392/4194252

OVMS# test realloc
First check heap integrity...
Now allocate 4KB RAM...
Check heap integrity...
Now re-allocate bigger, 1,000 times...
Check heap integrity...
Now re-allocate smaller, 1,000 times...
Check heap integrity...
And free the buffer...
Final check of heap integrity…

OVMS# module memory
Free 8-bit 119432/281952, 32-bit 8020/24276, SPIRAM 4140892/4194252

Maybe something inside mongoose itself? As that is the only place we are seeing this.

Regards, Mark.

> On 25 Apr 2018, at 3:18 PM, Stephen Casner <casner at acm.org> wrote:
> 
> Done, but now I get a corrupt heap abort.  That does not make sense
> because on my v3.0 hardware the new functions should boil down to
> vanilla malloc, calloc, realloc.  More tomorrow when I'm more awake.
> 
>                                                        -- Steve
> 
> On Wed, 25 Apr 2018, Mark Webb-Johnson wrote:
> 
>> ok
>> 
>> 
>>> On 25 Apr 2018, at 2:34 PM, Stephen Casner <casner at acm.org> wrote:
>>> 
>>> Mark,
>>> 
>>>>> Presumably defining those in mg_locals.h? Then I guess
>>>>> ovms.{h,cpp} needs our ExternalRamMalloc extended with
>>>>> ExternalRamCalloc and ExternalRamRealloc.
>>> 
>>> I already did this, but ran into the problem that ovms.{h,cpp} is C++
>>> code, whereas mongoose is C.  I started down a path of making
>>> duplicate C code just for mongoose, but I was trying to make this
>>> change without modifying mongoose.{h,c}.  I think the right approach
>>> would be to extract ExternalRamMalloc etc. into a separate pair of
>>> files ovms_malloc.{h,c} that are C and then include them from ovms.h.
>>> 
>>>                                                       -- Steve
>>> 
>>> On Wed, 25 Apr 2018, Mark Webb-Johnson wrote:
>>> 
>>>>> 
>>>>> Or shall we just make a ov_malloc, ov_calloc, and ov_realloc, to be easier on the eyes?
>>>> 
>>>> Let me know if you want me to do this first. Should be relatively simple as ESP-IDF provides equivalent functions.
>>>> 
>>>> Regards, Mark.
>>>> 
>>>>> On 25 Apr 2018, at 1:30 PM, Mark Webb-Johnson <mark at webb-johnson.net> wrote:
>>>>> 
>>>>> Steve,
>>>>> 
>>>>> I see:
>>>>> 
>>>>> #ifndef MG_MALLOC
>>>>> #define MG_MALLOC malloc
>>>>> #endif
>>>>> 
>>>>> #ifndef MG_CALLOC
>>>>> #define MG_CALLOC calloc
>>>>> #endif
>>>>> 
>>>>> #ifndef MG_REALLOC
>>>>> #define MG_REALLOC realloc
>>>>> #endif
>>>>> 
>>>>> #ifndef MG_FREE
>>>>> #define MG_FREE free
>>>>> #endif
>>>>> 
>>>>> Presumably defining those in mg_locals.h? Then I guess ovms.{h,cpp} needs our ExternalRamMalloc extended with ExternalRamCalloc and ExternalRamRealloc.
>>>>> 
>>>>> Or shall we just make a ov_malloc, ov_calloc, and ov_realloc, to be easier on the eyes?
>>>>> 
>>>>> Regards, Mark.
>>>>> 
>>>>>> On 25 Apr 2018, at 1:20 PM, Stephen Casner <casner at acm.org <mailto:casner at acm.org>> wrote:
>>>>>> 
>>>>>> Mark,
>>>>>> 
>>>>>> OK, I'll change mongoose to explicitly allocate from SPIRAM when
>>>>>> available.  I won't be able to completely test it, though, since I
>>>>>> only have v3.0.
>>>>>> 
>>>>>>                                                      -- Steve
>>>>>> 
>>>>>> On Wed, 25 Apr 2018, Mark Webb-Johnson wrote:
>>>>>> 
>>>>>>> Steve,
>>>>>>> 
>>>>>>> Because of things like this:
>>>>>>> 
>>>>>>> https://github.com/espressif/esp-idf/issues/1492 <https://github.com/espressif/esp-idf/issues/1492> <https://github.com/espressif/esp-idf/issues/1492 <https://github.com/espressif/esp-idf/issues/1492>>
>>>>>>> 
>>>>>>> I tried setting the "SPI RAM access method" to "Make RAM allocatable using malloc() as well", and reducing "Maximum malloc() size, in bytes, to always put in internal memory" to 32 bytes, but get this crash on startup...
>>>>>>> 
>>>>>>> Presumably that is esp_event_loop_init trying to create a FreeRTOS queue, and then objecting because it is not in internal RAM? It seems that the ESP IDF framework is not fully working with SPI RAM yet (due to inherent limitations). It would be better if the memory to be allocated must be internal, it is specifically allocated as internal (rather than just rely on a generic malloc).
>>>>>>> 
>>>>>>> Last time I tried it (earlier this year), the ESP-IDF just simply didn't work when that menuconfig option was set. When I fixed the above specific bug, it happened in another place.
>>>>>>> 
>>>>>>> I know Espressif have been working on it, and perhaps they've fixed it now? This commit seems to try to address it in a generic way (at least for the freertos port stuff, if not for all the other libraries that we use):
>>>>>>> 
>>>>>>> https://github.com/espressif/esp-idf/commit/16de6bff245dec5e63eee994f53a08252be720d4 <https://github.com/espressif/esp-idf/commit/16de6bff245dec5e63eee994f53a08252be720d4> <https://github.com/espressif/esp-idf/commit/16de6bff245dec5e63eee994f53a08252be720d4 <https://github.com/espressif/esp-idf/commit/16de6bff245dec5e63eee994f53a08252be720d4>>
>>>>>>> 
>>>>>>> IDF 3.0 has officially been released last night.
>>>>>>> 
>>>>>>> Regards, Mark.
>>>>>>> 
>>>>>>>> On 25 Apr 2018, at 4:40 AM, Stephen Casner <casner at acm.org <mailto:casner at acm.org>> wrote:
>>>>>>>> 
>>>>>>>> We've been dancing around the external RAM allocation decision with
>>>>>>>> our own ExtenalRamMalloc() function and extensions to the C++ new
>>>>>>>> allocator to try SPIRAM first but then back off to internal memory if
>>>>>>>> that fails (presumably because of v3.0 hardware rather than because
>>>>>>>> all 4MB of SPIRAM is used up).
>>>>>>>> 
>>>>>>>> Do we need all that?  There is already a mechanism in the multi-heap
>>>>>>>> memory system for the defaut malloc to try allocating from SPIRAM for
>>>>>>>> any request with size larger than malloc_alwaysinternal_limit which
>>>>>>>> can be set dynamically with heap_caps_malloc_extmem_enable().  If the
>>>>>>>> allocation from SPIRAM fails (again, presumably only on v3.0 hardware)
>>>>>>>> then the allocation is retried without the MALLOC_CAP_SPIRAM
>>>>>>>> capability, which is just what our ExternalRamMalloc does.
>>>>>>>> 
>>>>>>>> If there are only a few allocations such as stacks that must come from
>>>>>>>> internal memory, and if those allocations (in system code) already use
>>>>>>>> heap_caps_malloc(MALLOC_CAP_INTERNAL) to meet that requirement, then
>>>>>>>> we could probably just set malloc_alwaysinternal_limit to a small
>>>>>>>> number (perhaps 0) so that everything else comes from SPIRAM.
>>>>>>>> 
>>>>>>>> I don't know if any of our application memory uses are sufficiently
>>>>>>>> sensitive to memory performance that we would want to ensure that the
>>>>>>>> memory is internal.  My guess is that those would be few and therefore
>>>>>>>> it makes sense to take special action for them rather than for
>>>>>>>> everything else.
>>>>>>>> 
>>>>>>>> For continued support of v3.0 hardware we could even test whether
>>>>>>>> SPIRAM is present and leave malloc_alwaysinternal_limit at its default
>>>>>>>> value of -1 if not.
>>>>>>>> 
>>>>>>>> Opinions?
>>>>>>>> 
>>>>>>>>                                                     -- Steve
>>>>>>>> 
>>>>>>>> On Mon, 23 Apr 2018, Mark Webb-Johnson wrote:
>>>>>>>> 
>>>>>>>>> Michael,
>>>>>>>>> 
>>>>>>>>> Do we need ovms_extram.h? Can we just put the namespace directly
>>>>>>>>> into ovms.h (where all the other external ram allocation stuff is)?
>>>>>>>>> 
>>>>>>>>> Regards, Mark
>>>>>>>> _______________________________________________
>>>>>>>> OvmsDev mailing list
>>>>>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>>> 
>>>>>> _______________________________________________
>>>>>> OvmsDev mailing list
>>>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>> 
>>>> 
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>> 
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>> 
>> 
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20180426/35ac0f1b/attachment.htm>


More information about the OvmsDev mailing list