Dimitri's question about unclean shutdowns made me check our reboot code, and we actually do not take care of unmounting the config store.

While my crash observations do not suggest that crash vs. clean shutdown makes a difference to this bug, I still think we should do that.

The simple solution would be to do this last, in Boot::boot_shutdown_done() just before the esp_restart(), but that does not cover crashes during shutdown, which are still too frequent.

How about unmounting /store ASAP after the shuttingdown event (i.e. normal shutdown handling)?

Writing to the config would then have no effect during shutdown, but do we actually need (or want) that to be possible?
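
A minimal sketch of that idea, not the actual OVMS code: the StoreMount class and the g_unmount_calls counter are hypothetical names, and the ESP-IDF types and the esp_vfs_fat_spiflash_unmount() call are replaced by host-side stand-ins so the snippet builds without the IDF. It assumes the unmount may be requested twice (once after the shuttingdown event, once as a fallback just before esp_restart()) and that config writes after the unmount should simply be rejected.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Host-side stand-ins (assumption: only here so the sketch builds off-target).
// On the module, drop these and include "esp_vfs_fat.h" instead.
typedef int32_t esp_err_t;
typedef int32_t wl_handle_t;
static const esp_err_t ESP_OK = 0;
static const wl_handle_t WL_INVALID_HANDLE = -1;
static int g_unmount_calls = 0;  // demonstration only: counts stub calls

static esp_err_t esp_vfs_fat_spiflash_unmount(const char* base_path, wl_handle_t wl_handle)
{
  (void)base_path;
  (void)wl_handle;
  ++g_unmount_calls;
  return ESP_OK;
}

// Hypothetical helper wrapping the wear levelling handle of /store.
class StoreMount
{
public:
  explicit StoreMount(wl_handle_t wlh) : m_wlh(wlh) {}

  bool IsMounted() const { return m_wlh != WL_INVALID_HANDLE; }

  // Idempotent: a second call (e.g. the fallback before the restart)
  // does nothing if the first one already succeeded.
  esp_err_t Unmount()
  {
    if (!IsMounted()) return ESP_OK;
    esp_err_t err = esp_vfs_fat_spiflash_unmount("/store", m_wlh);
    if (err == ESP_OK) m_wlh = WL_INVALID_HANDLE;
    return err;
  }

  // Config writes are rejected once the store has been unmounted.
  bool Write(const char* key, const char* value)
  {
    if (!IsMounted()) return false;
    std::printf("write %s=%s\n", key, value);
    return true;
  }

private:
  wl_handle_t m_wlh;
};
```

With this shape, the early unmount and the late fallback can both call Unmount() without caring which one ran first.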
Regards,
Michael
On 25.11.18 at 03:12, Mark Webb-Johnson wrote:
I second this. A fantastic effort, Michael.

The log you provided on the GitHub issue looks really helpful, and I see Dimitri has replied:

Hopefully Dimitri can find the issue from here. So many little bugs in the wifi and bluetooth stacks are causing us random issues - it would be helpful to be able to update to the latest IDF.

Thanks, and Regards,
Mark.
Michael --
Kudos for this Herculean effort -- Steve
On Sat, 24 Nov 2018, Michael Balzer wrote:
Narrowing this down is a real PITA. The effect sometimes stops occurring, and I then need to leave the module powered down for a while to get the effect back. Sometimes 1 hour is sufficient; currently it's been off for 2 hours and still works. I had a test window yesterday evening, one this morning and one in the afternoon. Temperature is most probably irrelevant, as the last power downs were outside at ~4-5 °C to rule this out.

So as a "passed" can always be a false positive, I need to validate every passed step by switching back to a failing version and checking that it still fails.

It seems I now have to wait until tomorrow for the next test window, but I have bisected down to this range of just 8 commits:
balzer@leela:~/esp/esp-idf> git rev-list 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8 ^9d2f7c60d9aef9860c61c2756318ada68c80fddf
9d609af54c63e7f949a4fbc43d4f1c13b57f49d8
f392727abf7d56490c2f33127a59bfac42c937e0
e834d6fffc23a6fcfc0d2e871c9235417a7fb48f
35842d02abb5f574aaab466d46081a232fdd20a6
f05f3fbde87a9ce45c6818f71b49cd13888fd457
a6d6c58ecadb9759a0bacf35cd7332ac641e598d
321b1e02052de95db60ddce87eecce5f9e04e9b8
40486c872345584d34949b3ce83f9e956a7eea13
...with 9d609af54c63e7f949a4fbc43d4f1c13b57f49d8 being the last identified bad commit, and 9d2f7c60d9aef9860c61c2756318ada68c80fddf being the last good.

If I had to guess now, it's probably one of Dmitry's commits on the wear leveling code.
Regards,
Michael
On 23.11.18 at 17:20, Michael Balzer wrote:
It's not a timing issue: I've let it reboot about 30 times without any successful mount after the first failure.

Going into bisecting now...
On 23.11.18 at 15:54, Mark Webb-Johnson wrote:
> It may actually not be a corruption of the filesystem but some timing
> issue on the mount procedure. To test that we could disable the auto
> formatting on mount failures.

True.

A couple of Espressif guys have jumped on the issue, and I have provided some more information for them. I think the key will be reproducing it.

> The issue may also be dependent on the hardware version, i.e. it could
> be caused by the bug that caused the SD speed issue on the first 3.1
> batch.

That was definitely a hardware issue with the CP2102 chip. I don't think it is related to the ESP in any way.

Regards, Mark.
On 23 Nov 2018, at 10:34 PM, Michael Balzer <dexter@expeedo.de> wrote:
It may actually not be a corruption of the filesystem but some timing issue on the mount procedure. To test that we could disable the auto formatting on mount failures.
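
To illustrate what that test changes, here is a small host-side model, not ESP-IDF code: the reduced esp_vfs_fat_mount_config_t struct, make_store_config() and model_mount() are stand-ins made up for this sketch, and the mount logic is a simplified assumption of how the auto formatting behaves. The only setting that would change in the real mount code is format_if_mount_failed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-side stand-ins (assumption: reduced versions of the ESP-IDF types,
// only here so the sketch builds without the IDF).
typedef int32_t esp_err_t;
static const esp_err_t ESP_OK = 0;
static const esp_err_t ESP_FAIL = -1;

struct esp_vfs_fat_mount_config_t
{
  bool format_if_mount_failed;
  int max_files;
};

// Hypothetical helper: same settings as the current mount code, with the
// auto formatting made switchable.
static esp_vfs_fat_mount_config_t make_store_config(bool auto_format)
{
  esp_vfs_fat_mount_config_t cfg;
  std::memset(&cfg, 0, sizeof(cfg));
  cfg.format_if_mount_failed = auto_format;
  cfg.max_files = 5;
  return cfg;
}

struct MountResult
{
  esp_err_t err;   // what the caller sees
  bool formatted;  // true if the old contents were wiped
};

// Simplified model of a mount attempt (assumption, not the IDF implementation):
// with auto formatting, an unreadable filesystem is silently replaced by an
// empty one; without it, the failure is reported and the flash is left alone.
static MountResult model_mount(const esp_vfs_fat_mount_config_t& cfg, bool fs_readable)
{
  if (fs_readable) return { ESP_OK, false };
  if (cfg.format_if_mount_failed) return { ESP_OK, true };
  return { ESP_FAIL, false };
}
```

With auto formatting off, a later reboot can retry the mount on untouched flash contents, which is what would distinguish a transient mount problem from real corruption.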

The issue may also be dependent on the hardware version, i.e. it could be caused by the bug that caused the SD speed issue on the first 3.1 batch.

I have only tried the IDF update on my batch 1 module (my bench / development module). I think most of our edge testers also have that version.

Regards,
Michael
On 23.11.18 at 02:32, Mark Webb-Johnson wrote:
I have raised the following GitHub issue with Espressif:

https://github.com/espressif/esp-idf/issues/2730

Environment

* Development Kit: none
* Kit version (for WroverKit/PicoKit/DevKitC): none
* Module or chip used: ESP32-WROVER 16MB
* IDF version (run |git describe --tags| to find it): v3.2-beta1-208-g0d7f2d77c
* Build System: make
* Compiler version (run |xtensa-esp32-elf-gcc --version| to find it): (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 5.2.0
* Operating System: macOS
* Power Supply: USB

Problem Description

TLDR: Between May and July 2018 a change was made to esp idf master that is causing corruption on FAT filesystems mounted on SPI flash.

Our project uses a partitions.csv as follows:

# Name,   Type, SubType, Offset,  Size
nvs,      data, nvs,     0x9000,  0x4000
otadata,  data, ota,     0xd000,  0x2000
phy_init, data, phy,     0xf000,  0x1000
factory,  app,  factory, 0x10000, 4M
ota_0,    app,  ota_0,   ,        4M
ota_1,    app,  ota_1,   ,        4M
store,    data, fat,     ,        1M

The 'store' partition is formatted as FAT, as follows:

esp_vfs_fat_mount_config_t m_store_fat;
wl_handle_t m_store_wlh;
memset(&m_store_fat,0,sizeof(esp_vfs_fat_sdmmc_mount_config_t));
m_store_fat.format_if_mount_failed = true;
m_store_fat.max_files = 5;
esp_vfs_fat_spiflash_mount("/store", "store", &m_store_fat, &m_store_wlh);

We have previously used a clone of esp idf master, dated around May 22 2018, without issues. The partition is very reliable.

However, on Jul 6 2018, we updated our clone to use the latest esp idf master at that time. Shortly afterwards, users started to report that their 'store' filesystem contents were corrupted. We rolled back.

We have now tried again (updating on Oct 20 2018 to v3.2-beta1-208-g0d7f2d77c) and immediately had the same issue. Random corruption of FAT filesystem in SPI flash.

Expected Behavior

No corruption of FAT filesystem.

Actual Behavior

Corruption of FAT filesystem.

Steps to reproduce

1. Create a partition in SPI flash, and mount FAT filesystem
2. Read and write to files on FAT filesystem
3. Reboot
4. Observe random corruption and unmountable filesystem

Code to reproduce this issue

esp_vfs_fat_mount_config_t m_store_fat;
wl_handle_t m_store_wlh;
memset(&m_store_fat,0,sizeof(esp_vfs_fat_sdmmc_mount_config_t));
m_store_fat.format_if_mount_failed = true;
m_store_fat.max_files = 5;
esp_vfs_fat_spiflash_mount("/store", "store", &m_store_fat, &m_store_wlh);

Debug Logs

n/a

Other items if possible

Please advise if you need anything further.

I think the timeline is correct (the issue is in esp idf master some time between May and July 2018), but please let me know if you know differently (or update the github issue with your comments).

Regards, Mark

On 23 Nov 2018, at 6:19 AM, Michael Balzer <dexter@expeedo.de> wrote:
esp-idf and OVMS branches are back to the working version.

In case you also lost your config: I also just fixed a bug on restoring into an empty /store partition.

Regards,
Michael

On 22.11.18 at 22:34, Michael Balzer wrote:
See https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/165
I'll reset both master branches now. If you're about to pull, please wait until I've reverted the branches.

Regards,
Michael

--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
_______________________________________________
OvmsDev mailing list
OvmsDev@lists.openvehicles.com
http://lists.openvehicles.com/mailman/listinfo/ovmsdev