[Ovmsdev] Attention: esp-idf upgrade still breaks configuration

Stephen Casner casner at acm.org
Sun Nov 25 03:22:07 HKT 2018


Michael -- Kudos for this Herculean effort -- Steve

On Sat, 24 Nov 2018, Michael Balzer wrote:

> Narrowing this down is a real PITA. The effect sometimes stops to occur, I then need to leave the module powered down for a while to get the effect back.
> Sometimes 1 hour is sufficient, currently it's been off for 2 hours and still works. I had a test window yesterday evening, one this morning and one in the
> afternoon. Temperature is most probably irrelevant, as the last power downs were outside in ~4-5 °C to rule this out.
>
> So as a "passed" can always be a false positive I need to validate every passed step by switching back to a failing version and see if that still fails.
>
> It seems I now have to wait until tomorrow for the next test window, but I have bisected down to this range of just 8 commits:
>
> balzer at leela:~/esp/esp-idf> git rev-list 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8 ^9d2f7c60d9aef9860c61c2756318ada68c80fddf
> 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8
> f392727abf7d56490c2f33127a59bfac42c937e0
> e834d6fffc23a6fcfc0d2e871c9235417a7fb48f
> 35842d02abb5f574aaab466d46081a232fdd20a6
> f05f3fbde87a9ce45c6818f71b49cd13888fd457
> a6d6c58ecadb9759a0bacf35cd7332ac641e598d
> 321b1e02052de95db60ddce87eecce5f9e04e9b8
> 40486c872345584d34949b3ce83f9e956a7eea13
>
> ...with 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8 being the last identified bad commit, and 9d2f7c60d9aef9860c61c2756318ada68c80fddf being the last good.
>
> If I should guess now, it's probably one of Dmitry's commits on the wear leveling code.
>
> Regards,
> Michael
>
>
> Am 23.11.18 um 17:20 schrieb Michael Balzer:
> > It's not a timing issue, I've let it reboot about 30 times without any successful mount after the first failure.
> >
> > Going into bisecting now...
> >
> >
> > Am 23.11.18 um 15:54 schrieb Mark Webb-Johnson:
> >>> It may actually not be a corruption of the filesystem but some timing issue on the mount procedure. To test that we could disable the auto formatting on
> >>> mount failures.
> >>
> >> True.
> >>
> >> A couple of Espressif guys have jumped on the issue, and I have provided some more information for them. I think key will be reproducing it.
> >>
> >>> The issue may also be dependant on the hardware version, i.e. it could be caused by the bug that caused the SD speed issue on the first 3.1 batch.
> >>
> >> That was definitely a hardware issue with the CP2102 chip. I don't think related to ESP in any way.
> >>
> >> Regards, Mark.
> >>
> >>> On 23 Nov 2018, at 10:34 PM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
> >>>
> >>> It may actually not be a corruption of the filesystem but some timing issue on the mount procedure. To test that we could disable the auto formatting on
> >>> mount failures.
> >>>
> >>> The issue may also be dependant on the hardware version, i.e. it could be caused by the bug that caused the SD speed issue on the first 3.1 batch.
> >>>
> >>> I only have tried the idf update on my batch 1 module (my bench / development module). I think most of our edge testers also have that version.
> >>>
> >>> Regards,
> >>> Michael
> >>>
> >>>
> >>> Am 23.11.18 um 02:32 schrieb Mark Webb-Johnson:
> >>>> I have raised the following github issue to Espressif:
> >>>>
> >>>>     https://github.com/espressif/esp-idf/issues/2730
> >>>>
> >>>>
> >>>>         Environment
> >>>>
> >>>>       * Development Kit: none
> >>>>       * Kit version (for WroverKit/PicoKit/DevKitC): none
> >>>>       * Module or chip used: ESP32-WROVER 16MB
> >>>>       * IDF version (run |git describe --tags| to find it): v3.2-beta1-208-g0d7f2d77c
> >>>>       * Build System: make
> >>>>       * Compiler version (run |xtensa-esp32-elf-gcc --version| to find it): (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 5.2.0
> >>>>       * Operating System: macOS
> >>>>       * Power Supply: USB
> >>>>
> >>>>
> >>>>         Problem Description
> >>>>
> >>>>     TLDR: Between May and July 2018 a change was made to esp idf master that is causing corruption on FAT filesystems mounted on SPI flash.
> >>>>
> >>>>     Our project uses a partitions.csv as follows:
> >>>>
> >>>>     |# Name, Type, SubType, Offset, Size nvs, data, nvs, 0x9000, 0x4000 otadata, data, ota, 0xd000, 0x2000 phy_init, data, phy, 0xf000, 0x1000 factory,
> >>>>     app, factory, 0x10000, 4M ota_0, app, ota_0, , 4M ota_1, app, ota_1, , 4M store, data, fat, , 1M |
> >>>>
> >>>>     The 'store' partition is formatted as FAT, as follows:
> >>>>
> >>>>     esp_vfs_fat_mount_config_t m_store_fat;
> >>>>     wl_handle_t m_store_wlh;
> >>>>     memset(&m_store_fat,0,sizeof(esp_vfs_fat_sdmmc_mount_config_t));
> >>>>     m_store_fat.format_if_mount_failed = true;
> >>>>     m_store_fat.max_files = 5;
> >>>>     esp_vfs_fat_spiflash_mount("/store", "store", &m_store_fat, &m_store_wlh);
> >>>>
> >>>>     We have previously used a clone of esp idf master, dated around May 22 2018, without issues. The partition is very reliable.
> >>>>
> >>>>     However, on Jul 6 2018, we updated our clone to use the latest esp idf master at that time. Shortly afterwards, users started to report that their
> >>>>     'store' filesystem contents were corrupted. We rolled back.
> >>>>
> >>>>     We have now tried again (updating on Oct 20 2018 to v3.2-beta1-208-g0d7f2d77c) and immediately had the same issue. Random corruption of FAT filesystem
> >>>>     in SPI flash.
> >>>>
> >>>>
> >>>>           Expected Behavior
> >>>>
> >>>>     No corruption of FAT filesystem.
> >>>>
> >>>>
> >>>>           Actual Behavior
> >>>>
> >>>>     Corruption of FAT filesystem.
> >>>>
> >>>>
> >>>>           Steps to reproduce
> >>>>
> >>>>      1. Create a partition in SPI flash, and mount FAT filesystem
> >>>>      2. Read and write to files on FAT filesystem
> >>>>      3. Reboot
> >>>>      4. Observe random corruption and unmountable filesystem
> >>>>
> >>>>
> >>>>           Code to reproduce this issue
> >>>>
> >>>>     esp_vfs_fat_mount_config_t m_store_fat;
> >>>>     wl_handle_t m_store_wlh;
> >>>>     memset(&m_store_fat,0,sizeof(esp_vfs_fat_sdmmc_mount_config_t));
> >>>>     m_store_fat.format_if_mount_failed = true;
> >>>>     m_store_fat.max_files = 5;
> >>>>     esp_vfs_fat_spiflash_mount("/store", "store", &m_store_fat, &m_store_wlh);
> >>>>
> >>>>
> >>>>         Debug Logs
> >>>>
> >>>>     n/a
> >>>>
> >>>>
> >>>>         Other items if possible
> >>>>
> >>>>     Please advise if you need anything further.
> >>>>
> >>>>
> >>>> I think the timeline is correct (the issue is in esp idf master some time between May and July 2018), but please let me know if you know differently (or
> >>>> update the github issue with your comments).
> >>>>
> >>>> Regards, Mark
> >>>>
> >>>>> On 23 Nov 2018, at 6:19 AM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
> >>>>>
> >>>>> esp-idf and OVMS branches are back to the working version.
> >>>>>
> >>>>> In case you also lost your config:  I also just fixed a bug on restoring into an empty /store partition.
> >>>>>
> >>>>> Regards,
> >>>>> Michael
> >>>>>
> >>>>>
> >>>>> Am 22.11.18 um 22:34 schrieb Michael Balzer:
> >>>>>> See https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/165
> >>>>>>
> >>>>>> I'll reset both master branches now.
> >>>>>>
> >>>>>> If you're about to pull, please wait until I've reverted the branches.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Michael
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> >>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
> >>>>>
> >>>>> _______________________________________________
> >>>>> OvmsDev mailing list
> >>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
> >>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> OvmsDev mailing list
> >>>> OvmsDev at lists.openvehicles.com
> >>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
> >>>
> >>> --
> >>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> >>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
> >>> _______________________________________________
> >>> OvmsDev mailing list
> >>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
> >>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
> >>
> >>
> >> _______________________________________________
> >> OvmsDev mailing list
> >> OvmsDev at lists.openvehicles.com
> >> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
> >
> > --
> > Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> > Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
> >
> > _______________________________________________
> > OvmsDev mailing list
> > OvmsDev at lists.openvehicles.com
> > http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
> --
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>
>

                                                        -- Steve


More information about the OvmsDev mailing list