I’ve spent some time on this, and finally managed to reliably
reproduce it (at least in one case) as follows:
- Connect an external HUD and run ‘obdii ecu start can3’.
- Once the HUD is connected and working, manually change the baud
  rate to an incorrect value with ‘can can3 start active 250000’.
- Watch errors start streaming in.
- If I quickly switch back with ‘can can3 start active 500000’, it
  recovers and everything is fine.
- If I leave it running, it seems to count up to 128 errors, and then
  lock up. At this point even a ‘can can3 start active 500000’
  doesn’t solve it.
- A ‘power can3 off’ followed by ‘can can3 start active 500000’
  recovers it.
Here is what it looks like in the failed state:
OVMS# can can3 status
CAN: can3
Mode: Active
Speed: 250000
Interrupts: 35901
Rx pkt: 0
Rx err: 128
Rx ovrflw: 0
Tx pkt: 0
Tx delays: 0
Tx err: 0
Tx ovrflw: 0
Err flags: 0x800b
D (697321) canlog: Status can3 intr=35900 rxpkt=0 txpkt=0 errflags=0x800b rxerr=128 txerr=0 rxovr=0 txovr=0 txdelay=0
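
To compare logs from different modules it can help to pull the counters out of a canlog line programmatically. The following is only a sketch, assuming the key=value layout shown in the line above; parse_canlog is a hypothetical helper name, not something in the firmware.

```python
import re

def parse_canlog(line):
    """Collect every key=value pair of a canlog line into a dict of ints.

    Assumes the layout seen in this thread; int(value, 0) accepts both
    decimal counters and the hex errflags value.
    """
    fields = {}
    for key, value in re.findall(r"(\w+)=(\w+)", line):
        fields[key] = int(value, 0)
    return fields

line = ("D (697321) canlog: Status can3 intr=35900 rxpkt=0 txpkt=0 "
        "errflags=0x800b rxerr=128 txerr=0 rxovr=0 txovr=0 txdelay=0")
status = parse_canlog(line)
print(status["rxerr"], hex(status["errflags"]))
```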
Can you check to see what yours looks like next time
it fails?
Looking at the MCP2515 data sheet (page #45), it has
this to say:
6.6 Error States
Detected errors are made known to all other nodes via error
frames. The transmission of the erroneous message is
aborted and the frame is repeated as soon as possible.
Furthermore, each CAN node is in one of the three
error states according to the value of the internal error
counters:
1. Error-active.
2. Error-passive.
3. Bus-off (transmitter only).
The error-active state is the usual state where the node can
transmit messages and active error frames (made of dominant
bits) without any restrictions.
In the error-passive state, messages and passive
error frames (made of recessive bits) may be transmitted.
The bus-off state makes it temporarily impossible for the
station to participate in the bus communication. During this
state, messages can neither be received or transmitted. Only
transmitters can go bus-off.
6.7 Error Modes and Error Counters
The MCP2515 contains two error counters: the Receive Error
Counter (REC) (see Register 6-2) and the Transmit Error
Counter (TEC) (see Register 6-1). The values of both
counters can be read by the MCU. These counters
are incremented/decremented in accordance with the CAN bus
specification.
The MCP2515 is error-active if both error counters are below
the error-passive limit of 128.
It is error-passive if at least one of the error
counters equals or exceeds 128.
It goes to bus-off if the TEC exceeds the bus-off limit
of 255. The device remains in this state until the
bus-off recovery sequence is received. The bus-off
recovery sequence consists of 128 occurrences of 11
consecutive recessive bits (see Figure 6-1).
The Current Error mode of the MCP2515 can be read by the MCU
via the EFLG register (see Register 6-3).
Additionally, there is an error state warning flag
bit (EFLG:EWARN) which is set if at least one of the
error counters equals or exceeds the error warning limit
of 96. EWARN is reset if both error counters are less
than the error warning limit.
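
The thresholds quoted above can be written down as a small classifier. This is a sketch of the datasheet rules only, not of anything in our driver; the function names are made up for illustration, and feeding it our rxerr values assumes that counter tracks the chip's REC, which I have not verified.

```python
ERROR_WARNING_LIMIT = 96    # EWARN threshold from section 6.7
ERROR_PASSIVE_LIMIT = 128   # error-passive threshold
BUS_OFF_LIMIT = 255         # bus-off applies to the TEC only

def error_state(tec, rec):
    """Return the error state implied by the two counters."""
    if tec > BUS_OFF_LIMIT:
        return "bus-off"
    if tec >= ERROR_PASSIVE_LIMIT or rec >= ERROR_PASSIVE_LIMIT:
        return "error-passive"
    return "error-active"

def error_warning(tec, rec):
    """True when at least one counter has reached the warning limit."""
    return tec >= ERROR_WARNING_LIMIT or rec >= ERROR_WARNING_LIMIT

# Receive error counts taken from the logs in this thread, with TEC = 0.
for rec in (56, 113, 128):
    print(rec, error_state(0, rec), error_warning(0, rec))
```

Note that a receiver alone never reaches bus-off under these rules, which matches the "transmitter only" remark in section 6.6.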
I don’t think we access these TEC and REC registers, but
the 128 number cannot be a coincidence.
We do access the EFLG register, in our ISR, and here is
what I see:
E (685091) canlog: Error can3 intr=30 rxpkt=0 txpkt=0 errflags=0x8000 rxerr=56 txerr=0 rxovr=0 txovr=0 txdelay=0
E (685091) canlog: Error can3 intr=31 rxpkt=0 txpkt=0 errflags=0x8000 rxerr=58 txerr=0 rxovr=0 txovr=0 txdelay=0
E (685091) canlog: Error can3 intr=32 rxpkt=0 txpkt=0 errflags=0x8000 rxerr=60 txerr=0 rxovr=0 txovr=0 txdelay=0
E (685091) canlog: Error can3 intr=43 rxpkt=0 txpkt=0 errflags=0x8000 rxerr=81 txerr=0 rxovr=0 txovr=0 txdelay=0
E (685101) canlog: Error can3 intr=60 rxpkt=0 txpkt=0 errflags=0x8003 rxerr=113 txerr=0 rxovr=0 txovr=0 txdelay=0
The lower 8 bits of errflags are the EFLG register, so 0x00 is
normal, 0x03 is what appears as the errors build up, and 0x0b is
what we see later.
The documentation for this register is:
bit 7  bit 6  bit 5  bit 4  bit 3  bit 2  bit 1  bit 0
R/W-0  R/W-0  R-0    R-0    R-0    R-0    R-0    R-0
bit#7: RX1OVR: Receive Buffer 1 Overflow Flag bit
- Set when a valid message is received for RXB1 and
  CANINTF.RX1IF = 1
- Must be reset by MCU
bit#6: RX0OVR: Receive Buffer 0 Overflow Flag bit
- Set when a valid message is received for RXB0 and
  CANINTF.RX0IF = 1
- Must be reset by MCU
bit#5: TXBO: Bus-Off Error Flag bit
- Bit set when TEC reaches 255
- Reset after a successful bus recovery sequence
bit#4: TXEP: Transmit Error-Passive Flag bit
- Set when TEC is equal to or greater than 128
- Reset when TEC is less than 128
bit#3: RXEP: Receive Error-Passive Flag bit
- Set when REC is equal to or greater than 128
- Reset when REC is less than 128
bit#2: TXWAR: Transmit Error Warning Flag bit
- Set when TEC is equal to or greater than 96
- Reset when TEC is less than 96
bit#1: RXWAR: Receive Error Warning Flag bit
- Set when REC is equal to or greater than 96
- Reset when REC is less than 96
bit#0: EWARN: Error Warning Flag bit
- Set when TEC or REC is equal to or greater than 96 (TXWAR or
  RXWAR = 1)
- Reset when both REC and TEC are less than 96
So that is EWARN+RXWAR (0x03) as the error count climbs towards 128,
and EWARN+RXWAR+RXEP (0x0b) when everything is locked up. We have
code to clear the error condition (in the interrupt flags register),
but that doesn’t seem to get us out of this 128-error lock-up.
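
For reference, here is how the low byte of errflags maps onto the bit names in the register description above. It is a minimal sketch with a made-up helper name (decode_eflg); it deliberately ignores the upper byte, since I have only identified the lower 8 bits as EFLG.

```python
# EFLG bit names, index 0 = bit 0, taken from the register description.
EFLG_BITS = ["EWARN", "RXWAR", "TXWAR", "RXEP",
             "TXEP", "TXBO", "RX0OVR", "RX1OVR"]

def decode_eflg(errflags):
    """Return the names of the EFLG bits set in the low byte of errflags."""
    eflg = errflags & 0xFF
    return [name for bit, name in enumerate(EFLG_BITS) if eflg & (1 << bit)]

for value in (0x8000, 0x8003, 0x800b):
    print(hex(value), "+".join(decode_eflg(value)) or "none")
```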
I am not sure of the best approach for this. Perhaps pick up
the condition, and reset the SPI bus, in a timer every 10
seconds or so?
I am not sure if this is your problem (a ‘can can2 status’
would tell us). In any case, the fix for this is to pick up
this error condition in the ISR and fix it (or perhaps in a
separate periodic timer).
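
To make the periodic-timer idea concrete, here is a rough sketch of the detection logic only. Everything in it is hypothetical: the CanWatchdog class, the three callbacks, and the rule "RXEP set and no new packets since the last tick" are assumptions for discussion, not existing firmware code. What the recover callback should actually do is the open question; so far the only thing I have seen work is ‘power can3 off’ followed by a fresh ‘can can3 start’.

```python
RXEP = 0x08  # EFLG bit 3: receive error-passive

class CanWatchdog:
    """Hypothetical periodic check, meant to be ticked every 10 s or so."""

    def __init__(self, read_errflags, read_rxpkt, recover):
        self.read_errflags = read_errflags  # returns the errflags value
        self.read_rxpkt = read_rxpkt        # returns the received packet count
        self.recover = recover              # assumed recovery action
        self.last_rxpkt = None

    def tick(self):
        """Run one check; return True if recovery was triggered."""
        eflg = self.read_errflags() & 0xFF
        rxpkt = self.read_rxpkt()
        # Stalled means no packets arrived since the previous tick.
        stalled = self.last_rxpkt is not None and rxpkt == self.last_rxpkt
        self.last_rxpkt = rxpkt
        if (eflg & RXEP) and stalled:
            self.recover()
            self.last_rxpkt = None  # start counting afresh after recovery
            return True
        return False
```

Requiring two consecutive ticks without traffic avoids resetting a bus that is error-passive but still receiving.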
Regards, Mark.
I haven't had a chance to try to work out what is going on.
I can say that the second CAN interface doesn't work for
very long before stopping. This manifests most obviously
on my Leaf as a stopped odometer in the OVMS app. If you
look at the metrics in the console then everything that
comes from the Car CAN bus (i.e. the second CAN bus) has
frozen.
The first CAN interface seems much more reliable, with
SOC information from the EV bus being fairly reliably
reported.
I haven't done the modification to make my 3.0 unit's
GPS work so I haven't experienced the stolen detection.
On 05/07/18 18:34, Stein Arne Sordal wrote:
Did anyone figure out what happens here?
Now the OVMS thinks my car is stolen since it's moving
(GPS) and CAN2 is dead.
A reboot of the module brings CAN2 back to life for a period
of time.
-Stein Arne Sordal-
On 11 May 2018, at 12:29, Stein Arne Sordal <ovms@topphemmelig.no> wrote:
Hi Tom
I have seen this with my Leaf.
I've been on vacation, so I haven't had time to test
a lot, but it looks like one of the CAN buses stops.
Started testing again today.
-Stein Arne Sordal-
On 11 May 2018, at 12:22, Tom Parker <tom@carrott.org> wrote:
Hi all,
I synced up with master about a week ago and since
then I've seen both CAN buses stop working. I
still see the 12v battery metric changing, but
everything that comes from the car stops.
Rebooting the module with "module reset" does not
seem to fix it, while "make app-flash monitor" does
fix it. I haven't tried "make monitor" on its own.
Is anyone else seeing behavior like this?
Sorry for the vague bug report. I'll spend some
time later this weekend to try to gather more
information.
_______________________________________________
OvmsDev mailing list
OvmsDev@lists.openvehicles.com
http://lists.openvehicles.com/mailman/listinfo/ovmsdev