[Ovmsdev] Moving stuff to SPI RAM

Mark Webb-Johnson mark at webb-johnson.net
Tue Feb 20 10:41:04 HKT 2018


Here is the Espressif document on it:

http://esp-idf.readthedocs.io/en/latest/api-guides/external-ram.html

Restrictions

The use of external RAM has a few restrictions:
- When disabling flash cache (for example, because the flash is being written to), the external RAM also becomes inaccessible; any reads from or writes to it will lead to an illegal cache access exception. This is also the reason that ESP-IDF will never allocate a task's stack in external RAM.
- External RAM cannot be used as a place to store DMA transaction descriptors or as a buffer for a DMA transfer to read from or write into. Any buffers that will be used in combination with DMA must be allocated using heap_caps_malloc(size, MALLOC_CAP_DMA) (and can be freed using a standard free() call).
- External RAM uses the same cache region as the external flash. This means that often accessed variables in external RAM can be read and modified almost as quickly as in internal RAM. However, when accessing large chunks of data (>32K), the cache can be insufficient and speeds will fall back to the access speed of the external RAM. Moreover, accessing large chunks of data can ‘push out’ cached flash, possibly making execution of code afterwards slower.
- External RAM cannot be used as task stack memory; because of this, xTaskCreate and similar functions will always allocate internal memory for stack and task TCBs and xTaskCreateStatic-type functions will check if the buffers passed are internal. However, for tasks not calling on code in ROM in any way, directly or indirectly, the menuconfig option SPIRAM_ALLOW_STACK_EXTERNAL_MEMORY <http://esp-idf.readthedocs.io/en/latest/api-reference/kconfig.html#config-spiram-allow-stack-external-memory> will eliminate the check in xTaskCreateStatic, allowing task stack in external RAM. Using this is not advised, however.
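
As a rough illustration of how those restrictions shape allocation choices, here is a minimal sketch. The helper names AllocDmaBuffer and AllocBulkBuffer are made up for this example (they are not in ovms.h or in ESP-IDF); heap_caps_malloc, MALLOC_CAP_DMA and MALLOC_CAP_SPIRAM come from ESP-IDF's esp_heap_caps.h. The non-ESP branches are only host stand-ins so the sketch compiles off-target.

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(ESP_PLATFORM)
#include "esp_heap_caps.h"
#endif

// Hypothetical helper: a buffer that a DMA engine will read or write.
// It must come from DMA-capable internal RAM, never from SPI RAM.
void* AllocDmaBuffer(std::size_t size)
  {
#if defined(ESP_PLATFORM)
  return heap_caps_malloc(size, MALLOC_CAP_DMA);
#else
  return std::malloc(size);  // host stand-in, no DMA distinction here
#endif
  }

// Hypothetical helper: bulk data that only the CPU touches.
// Try external (SPI) RAM first, then fall back to the default heap.
void* AllocBulkBuffer(std::size_t size)
  {
#if defined(ESP_PLATFORM)
  void* p = heap_caps_malloc(size, MALLOC_CAP_SPIRAM);
  if (p != nullptr) return p;
#endif
  return std::malloc(size);
  }
```

Either pointer can be released with a plain free(). Task stacks are deliberately not covered, since they stay in internal RAM anyway.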

The way it works is that SPIRAM is accessed using the SPI protocol at 40 MHz, the same speed and protocol used for the flash memory that holds code. A portion of internal RAM is then reserved as a cache (for both SPI RAM and flash). Think of it as something like swap.

It is hard to guess the performance impact, as it depends on cache hit rates. I would expect it to behave approximately the same as flash code access (same protocol, same bus speed).

Regards, Mark.

> On 20 Feb 2018, at 10:36 AM, Greg D. <gregd2350 at gmail.com> wrote:
> 
> Probably worth trying, at least.  That was kind of what I was thinking we would do. 
> 
> I think the only storage I access really frequently is already procedure-local, so should be on the stack.  Are there any restrictions on doing I/O to/from SPIRAM?  For example, I/O buffers aimed at a CAN bus?
> 
> For the proverbial back-of-the-envelope purposes, what is the access speed for SPIRAM?  Also, what access granularity (i.e. is there a fundamental "block" that gets transacted, or is this a sizeof-level thing)?
> 
> Greg
> 
> 
> Mark Webb-Johnson wrote:
>> 
>> Another interesting approach:
>> 
>> main/ovms.cpp:
>> 
>> void* operator new(std::size_t sz)
>>   {
>>   return ExternalRamMalloc(sz);
>>   }
>> 
>> OVMS > module memory
>> Free 8-bit 197400/282424, 32-bit 424/27596, SPIRAM 4113632/4194252
>> --Task--     Total DRAM D/IRAM   IRAM SPIRAM   +/- DRAM D/IRAM   IRAM SPIRAM
>> esp_timer         17068      0    644  35676     +17068     +0   +644 +35676
>> main              12452      0      0   5352     +12452     +0     +0  +5352
>> Housekeeping      28404      0      0  17292     +28404     +0     +0 +17292
>> tiT                   0      0      0    132         +0     +0     +0   +132
>> AsyncConsole          0      0  26404     20         +0     +0 +26404    +20
>> no task            5348      0      0      0      +5348     +0     +0     +0
>> ipc0              10848      0      0      0     +10848     +0     +0     +0
>> ipc1                 12      0      0      0        +12     +0     +0     +0
>> Tmr Svc            7328      0      0      0      +7328     +0     +0     +0
>> 
>> With WIFI enabled and connected to an access point:
>> 
>> OVMS > module memory
>> Free 8-bit 163600/282424, 32-bit 424/27596, SPIRAM 4110952/4194252
>> 
>> That is a global override for all C++ objects to be allocated from SPIRAM. I bet you haven’t seen so much free internal RAM on an ESP32 before...
>> 
>> There are also some C code malloc’s that we could move over as well (the most obvious being javascript duktape, of course, but also mongoose).
>> 
>> I’m wondering if there is any reason not to simply do this global override for C++ code? Any stuff that won’t work in SPIRAM could explicitly malloc what it needs.
>> 
>> Regards, Mark.
>> 
>>> On 17 Feb 2018, at 9:41 PM, Mark Webb-Johnson <mark at webb-johnson.net> wrote:
>>> 
>>> There is a significant performance hit using SPI vs internal RAM. There are also restrictions (no stacks, no DMA targets, no use from ISRs, etc.). I’ve tried just changing malloc to use SPI RAM, but the Espressif IDF libraries don’t work with that. Maybe in 3-6 months, but not today.
>>> 
>>> I still have to solve the problem of std:: objects (strings, etc.). I think a new C++ memory allocator should work.
>>> 
>>> Regards, Mark
>>> 
>>>> On 17 Feb 2018, at 9:33 AM, Stephen Casner <casner at acm.org> wrote:
>>>> 
>>>> Mark,
>>>> 
>>>> This looks good to me (LGTM), but there might be cases where we need
>>>> to avoid adding the new ExternalRamAllocated class as a base for a
>>>> building-block class and instead add it as a base of a subset of the
>>>> classes that derive from the building-block class.
>>>> 
>>>> If there isn't any significant performance hit going to SPIRAM, then I
>>>> expect most allocations other than stacks and DMA buffers can go
>>>> there.
>>>> 
>>>>                                                       -- Steve
>>>> 
>>>>> On Fri, 16 Feb 2018, Mark Webb-Johnson wrote:
>>>>> 
>>>>> 
>>>>> I’ve committed (and pushed) an experimental extension to allow C++ objects to be optionally moved to SPI RAM.
>>>>> 
>>>>> The code is in ovms.h, so should be easily accessible to everything. It is pretty simple:
>>>>> 
>>>>> We have a new class ‘ExternalRamAllocated’. That does nothing except override the new and new[] operators.
>>>>> 
>>>>> Those operators try to malloc from SPI RAM first. If that doesn’t succeed then they fall back to standard internal RAM.
>>>>> 
>>>>> The definition of a C++ class can then be changed to make it “: public ExternalRamAllocated”. Once that is done, any objects of that class allocated will try to be placed in external (SPI) RAM.
>>>>> 
>>>>> For other malloc uses, a general purpose 'void* ExternalRamMalloc(std::size_t sz)’ function is also provided.
>>>>> 
>>>>> To test this, I’ve made a one line change to the OvmsCommand class:
>>>>> 
>>>>> -class OvmsCommand
>>>>> +class OvmsCommand : public ExternalRamAllocated
>>>>> 
>>>>> Here is what the memory usage looks like:
>>>>> 
>>>>> With SPI RAM disabled (in menuconfig):
>>>>> 
>>>>> Free 8-bit 120844/284304, 32-bit 30508/57680, SPIRAM 0/0
>>>>> --Task--     Total DRAM D/IRAM   IRAM SPIRAM   +/- DRAM D/IRAM   IRAM SPIRAM
>>>>> no task            5312      0      0      0      +5312     +0     +0     +0
>>>>> esp_timer         52328      0    644      0     +52328     +0   +644     +0
>>>>> main              16448      0      0      0     +16448     +0     +0     +0
>>>>> ipc0              11096      0      0      0     +11096     +0     +0     +0
>>>>> Housekeeping      40576   5120      0      0     +40576  +5120     +0     +0
>>>>> tiT                 128      0      0      0       +128     +0     +0     +0
>>>>> Tmr Svc             884   6444      0      0       +884  +6444     +0     +0
>>>>> ipc1                 12      0      0      0        +12     +0     +0     +0
>>>>> AsyncConsole         20      0  26404      0        +20     +0 +26404     +0
>>>>> 
>>>>> With SPI RAM enabled, but without deriving OvmsCommand from ExternalRamAllocated:
>>>>> 
>>>>> Free 8-bit 119240/282436, 32-bit 424/27596, SPIRAM 4193924/4194252
>>>>> --Task--     Total DRAM D/IRAM   IRAM SPIRAM   +/- DRAM D/IRAM   IRAM SPIRAM
>>>>> tiT                   0      0      0    128         +0     +0     +0   +128
>>>>> Housekeeping      40564   5120      0     12     +40564  +5120     +0    +12
>>>>> no task            5348      0      0      0      +5348     +0     +0     +0
>>>>> esp_timer         52328      0    644      0     +52328     +0   +644     +0
>>>>> main              16448      0      0      0     +16448     +0     +0     +0
>>>>> ipc0              11096      0      0      0     +11096     +0     +0     +0
>>>>> ipc1                 12      0      0      0        +12     +0     +0     +0
>>>>> Tmr Svc             884   6444      0      0       +884  +6444     +0     +0
>>>>> AsyncConsole         20      0  26404      0        +20     +0 +26404     +0
>>>>> 
>>>>> After deriving OvmsCommand from ExternalRamAllocated:
>>>>> 
>>>>> OVMS > module memory
>>>>> Free 8-bit 152308/282432, 32-bit 424/27596, SPIRAM 4160852/4194252
>>>>> --Task--     Total DRAM D/IRAM   IRAM SPIRAM   +/- DRAM D/IRAM   IRAM SPIRAM
>>>>> esp_timer         31664      0    644  20664     +31664     +0   +644 +20664
>>>>> tiT                   0      0      0    128         +0     +0     +0   +128
>>>>> Housekeeping      39636      0      0   6060     +39636     +0     +0  +6060
>>>>> no task            5348      0      0      0      +5348     +0     +0     +0
>>>>> main              16448      0      0      0     +16448     +0     +0     +0
>>>>> ipc0              11096      0      0      0     +11096     +0     +0     +0
>>>>> ipc1                 12      0      0      0        +12     +0     +0     +0
>>>>> Tmr Svc            7328      0      0      0      +7328     +0     +0     +0
>>>>> AsyncConsole         20      0  26404      0        +20     +0 +26404     +0
>>>>> 
>>>>> The advantage of SPI RAM is obvious: roughly 32KB of internal RAM saved (free 8-bit memory went from 119240 to 152308 bytes) with just one line changed in a header file. Most of our other objects like that (metrics, configs, etc.) could be equally easily moved to SPI RAM. We can make the decision on a class-by-class basis.
>>>>> 
>>>>> I think this is a pretty good solution. It leaves the decision of whether to use SPI RAM to the object itself (as presumably the object knows best whether it can actually be put in SPI RAM). It is also extremely simple to define.
>>>>> 
>>>>> But, it is an experiment. Please let me know what you think.
>>>>> 
>>>>> Regards, Mark.
>>>>> 
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.teslaclub.hk
>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
