<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">I second this. A fantastic effort, Michael.<div class=""><br class=""></div><div class="">The log you provided on the GitHub issue looks really helpful, and I see Dmitry has replied:</div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div class=""><a href="https://github.com/dmitry1945" class="">dmitry1945</a> commented <a href="https://github.com/espressif/esp-idf#issuecomment-441391665" id="issuecomment-441391665-permalink" class="">6 hours ago</a></div><div class="">Thank you <a href="https://github.com/dexterbg" class="">@dexterbg</a>, the place is clear.</div></div></blockquote><div class=""><div><br class=""></div><div>Hopefully Dmitry can find the issue from here. 
There are so many little bugs in the Wi-Fi and Bluetooth stacks causing us random issues; it would be helpful to be able to update to the latest IDF.</div><div><br class=""></div><div>Thanks, and Regards,</div><div>Mark.</div><div><br class=""><blockquote type="cite" class=""><div class="">On 25 Nov 2018, at 3:22 AM, Stephen Casner &lt;<a href="mailto:casner@acm.org" class="">casner@acm.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">Michael -- Kudos for this Herculean effort -- Steve<br class=""><br class="">On Sat, 24 Nov 2018, Michael Balzer wrote:<br class=""><br class=""><blockquote type="cite" class="">Narrowing this down is a real PITA. The effect sometimes stops occurring; I then need to leave the module powered down for a while to get the effect back.<br class="">Sometimes 1 hour is sufficient; currently it's been off for 2 hours and still works. I had a test window yesterday evening, one this morning and one in the afternoon. Temperature is most probably irrelevant, as the last power-downs were done outside at ~4-5 °C to rule this out.<br class=""><br class="">Since a "passed" can always be a false positive, I need to validate every passed step by switching back to a failing version and checking that it still fails.<br class=""><br class="">It seems I now have to wait until tomorrow for the next test window, but I have bisected down to this range of just 8 commits:<br class=""><br class="">balzer@leela:~/esp/esp-idf&gt; git rev-list 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8 ^9d2f7c60d9aef9860c61c2756318ada68c80fddf<br class="">9d609af54c63e7f949a4fbc43d4f1c13b57f49d8<br class="">f392727abf7d56490c2f33127a59bfac42c937e0<br class="">e834d6fffc23a6fcfc0d2e871c9235417a7fb48f<br class="">35842d02abb5f574aaab466d46081a232fdd20a6<br class="">f05f3fbde87a9ce45c6818f71b49cd13888fd457<br class="">a6d6c58ecadb9759a0bacf35cd7332ac641e598d<br class="">321b1e02052de95db60ddce87eecce5f9e04e9b8<br class="">40486c872345584d34949b3ce83f9e956a7eea13<br 
class=""><br class="">...with 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8 being the last identified bad commit, and 9d2f7c60d9aef9860c61c2756318ada68c80fddf being the last good one.<br class=""><br class="">If I had to guess now, it's probably one of Dmitry's commits on the wear leveling code.<br class=""><br class="">Regards,<br class="">Michael<br class=""><br class=""><br class="">On 23.11.18 at 17:20, Michael Balzer wrote:<br class=""><blockquote type="cite" class="">It's not a timing issue: I've let it reboot about 30 times without a single successful mount after the first failure.<br class=""><br class="">Going into bisecting now...<br class=""><br class=""><br class="">On 23.11.18 at 15:54, Mark Webb-Johnson wrote:<br class=""><blockquote type="cite" class=""><blockquote type="cite" class="">It may actually not be a corruption of the filesystem but some timing issue in the mount procedure. To test that, we could disable the auto formatting on mount failures.<br class=""></blockquote><br class="">True.<br class=""><br class="">A couple of Espressif guys have jumped on the issue, and I have provided some more information for them. I think the key will be reproducing it.<br class=""><br class=""><blockquote type="cite" class="">The issue may also be dependent on the hardware version, i.e. it could be caused by the bug that caused the SD speed issue on the first 3.1 batch.<br class=""></blockquote><br class="">That was definitely a hardware issue with the CP2102 chip. I don't think it is related to the ESP in any way.<br class=""><br class="">Regards, Mark.<br class=""><br class=""><blockquote type="cite" class="">On 23 Nov 2018, at 10:34 PM, Michael Balzer &lt;<a href="mailto:dexter@expeedo.de" class="">dexter@expeedo.de</a>&gt; wrote:<br class=""><br class="">It may actually not be a corruption of the filesystem but some timing issue in the mount procedure. 
To test that, we could disable the auto formatting on mount failures.<br class=""><br class="">The issue may also be dependent on the hardware version, i.e. it could be caused by the bug that caused the SD speed issue on the first 3.1 batch.<br class=""><br class="">I have only tried the IDF update on my batch 1 module (my bench / development module). I think most of our edge testers also have that version.<br class=""><br class="">Regards,<br class="">Michael<br class=""><br class=""><br class="">On 23.11.18 at 02:32, Mark Webb-Johnson wrote:<br class=""><blockquote type="cite" class="">I have raised the following GitHub issue with Espressif:<br class=""><br class=""> <a href="https://github.com/espressif/esp-idf/issues/2730" class="">https://github.com/espressif/esp-idf/issues/2730</a><br class=""><br class=""><br class=""> Environment<br class=""><br class=""> * Development Kit: none<br class=""> * Kit version (for WroverKit/PicoKit/DevKitC): none<br class=""> * Module or chip used: ESP32-WROVER 16MB<br class=""> * IDF version (run |git describe --tags| to find it): v3.2-beta1-208-g0d7f2d77c<br class=""> * Build System: make<br class=""> * Compiler version (run |xtensa-esp32-elf-gcc --version| to find it): (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 5.2.0<br class=""> * Operating System: macOS<br class=""> * Power Supply: USB<br class=""><br class=""><br class=""> Problem Description<br class=""><br class=""> TLDR: Between May and July 2018 a change was made to ESP-IDF master that is causing corruption on FAT filesystems mounted on SPI flash.<br class=""><br class=""> Our project uses a partitions.csv as follows:<br class=""><br class=""> # Name, Type, SubType, Offset, Size<br class=""> nvs, data, nvs, 0x9000, 0x4000<br class=""> otadata, data, ota, 0xd000, 0x2000<br class=""> phy_init, data, phy, 0xf000, 0x1000<br class=""> factory, app, factory, 0x10000, 4M<br class=""> ota_0, app, ota_0, , 4M<br class=""> ota_1, app, ota_1, , 4M<br class=""> store, data, fat, , 1M<br class=""><br class=""> The 'store' partition is 
formatted as FAT, as follows:<br class=""><br class=""> esp_vfs_fat_mount_config_t m_store_fat;<br class=""> wl_handle_t m_store_wlh;<br class=""> memset(&m_store_fat,0,sizeof(esp_vfs_fat_sdmmc_mount_config_t));<br class=""> m_store_fat.format_if_mount_failed = true;<br class=""> m_store_fat.max_files = 5;<br class=""> esp_vfs_fat_spiflash_mount("/store", "store", &m_store_fat, &m_store_wlh);<br class=""><br class=""> We have previously used a clone of esp idf master, dated around May 22 2018, without issues. The partition is very reliable.<br class=""><br class=""> However, on Jul 6 2018, we updated our clone to use the latest esp idf master at that time. Shortly afterwards, users started to report that their<br class=""> 'store' filesystem contents were corrupted. We rolled back.<br class=""><br class=""> We have now tried again (updating on Oct 20 2018 to v3.2-beta1-208-g0d7f2d77c) and immediately had the same issue. Random corruption of FAT filesystem<br class=""> in SPI flash.<br class=""><br class=""><br class=""> Expected Behavior<br class=""><br class=""> No corruption of FAT filesystem.<br class=""><br class=""><br class=""> Actual Behavior<br class=""><br class=""> Corruption of FAT filesystem.<br class=""><br class=""><br class=""> Steps to reproduce<br class=""><br class=""> 1. Create a partition in SPI flash, and mount FAT filesystem<br class=""> 2. Read and write to files on FAT filesystem<br class=""> 3. Reboot<br class=""> 4. 
Observe random corruption and unmountable filesystem<br class=""><br class=""><br class=""> Code to reproduce this issue<br class=""><br class=""> esp_vfs_fat_mount_config_t m_store_fat;<br class=""> wl_handle_t m_store_wlh;<br class=""> memset(&m_store_fat,0,sizeof(esp_vfs_fat_sdmmc_mount_config_t));<br class=""> m_store_fat.format_if_mount_failed = true;<br class=""> m_store_fat.max_files = 5;<br class=""> esp_vfs_fat_spiflash_mount("/store", "store", &m_store_fat, &m_store_wlh);<br class=""><br class=""><br class=""> Debug Logs<br class=""><br class=""> n/a<br class=""><br class=""><br class=""> Other items if possible<br class=""><br class=""> Please advise if you need anything further.<br class=""><br class=""><br class="">I think the timeline is correct (the issue was introduced in ESP-IDF master some time between May and July 2018), but please let me know if you know differently (or update the GitHub issue with your comments).<br class=""><br class="">Regards, Mark<br class=""><br class=""><blockquote type="cite" class="">On 23 Nov 2018, at 6:19 AM, Michael Balzer &lt;<a href="mailto:dexter@expeedo.de" class="">dexter@expeedo.de</a>&gt; wrote:<br class=""><br class="">The esp-idf and OVMS branches are back to the working version.<br class=""><br class="">In case you also lost your config: I also just fixed a bug in restoring into an empty /store partition.<br class=""><br class="">Regards,<br class="">Michael<br class=""><br class=""><br class="">On 22.11.18 at 22:34, Michael Balzer wrote:<br class=""><blockquote type="cite" class="">See <a href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/165" class="">https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/165</a><br class=""><br class="">I'll reset both master branches now.<br class=""><br class="">If you're about to pull, please wait until I've reverted the branches.<br class=""><br 
class="">Regards,<br class="">Michael<br class=""><br class=""></blockquote><br class="">--<br class="">Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal<br class="">Fon 02333 / 833 5735 * Handy 0176 / 206 989 26<br class=""><br class="">_______________________________________________<br class="">OvmsDev mailing list<br class=""><a href="mailto:OvmsDev@lists.openvehicles.com" class="">OvmsDev@lists.openvehicles.com</a> <<a href="mailto:OvmsDev@lists.openvehicles.com" class="">mailto:OvmsDev@lists.openvehicles.com</a>><br class=""><a href="http://lists.openvehicles.com/mailman/listinfo/ovmsdev" class="">http://lists.openvehicles.com/mailman/listinfo/ovmsdev</a><br class=""></blockquote><br class=""></blockquote><br class=""></blockquote><br class=""></blockquote><br class=""></blockquote><br class=""></blockquote><br class=""> -- Steve<br class=""></div></div></blockquote></div><br class=""></div></body></html>