[Ovmsdev] DBC and Endian-ness

Mark Webb-Johnson mark at webb-johnson.net
Mon Feb 11 10:55:29 HKT 2019


I’ve been going round and round in circles on this, most of the weekend, but not getting anywhere. I’ve rewritten the Decoder code four times so far, but just can’t get it to work. So calling for help here.

I attach the DBC specification. You can also find an extensive python library here:

https://github.com/eerimoq/cantools.git

(for anyone playing with DBC, that seems an excellent library and set of command line tools)

The seemingly simple problem that I am trying to solve is:

Given a CAN frame, we need to decode a particular signal. We want the integer value (signed or unsigned) out of it.

The signal specifies the start bit, size (number of bits), endian-ness (big or little), and value type (signed or unsigned). The DBC specification says:

The start_bit value specifies the position of the signal within the data field of the frame. For signals with byte order Intel (little endian) the position of the least-significant bit is given. For signals with byte order Motorola (big endian) the position of the most significant bit is given. The bits are counted in a saw-tooth manner.

byte_order = '0' | '1' (* 0=little endian, 1=big endian *)

The byte_format is 0 if the signal's byte order is Intel (little endian) or 1 if the byte order is Motorola (big endian).

There are also factor and offset values for signals, but those are trivial to apply once we have the raw integer signal value extracted, so they can be ignored for the moment.
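
For reference, that later scaling step is just a linear mapping, physical = raw * factor + offset. A minimal sketch in C (the function name is made up here for illustration and is not part of cantools or any other library):

```c
#include <stdint.h>

/* Convert an already extracted raw signal value to its physical value,
 * using the factor and offset from the DBC signal definition.
 * dbc_raw_to_physical is a hypothetical helper name. */
static double dbc_raw_to_physical(int64_t raw, double factor, double offset)
{
    return (double)raw * factor + offset;
}
```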

Wikipedia explains endian-ness:

In big-endian format, whenever addressing memory or sending/storing words bytewise, the most significant byte—the byte containing the most significant bit—is stored first (has the lowest address) or sent first, then the following bytes are stored or sent in decreasing significance order, with the least significant byte—the one containing the least significant bit—stored last (having the highest address) or sent last.

Little-endian format reverses this order: the sequence addresses/sends/stores the least significant byte first (lowest address) and the most significant byte last (highest address).
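
To make those two definitions concrete for a 16-bit value held in two consecutive bytes, here is a small sketch in C. The function names are my own, chosen only for illustration:

```c
#include <stdint.h>

/* Big endian: the byte at the lower index is the most significant. */
static uint16_t read_u16_be(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Little endian: the byte at the lower index is the least significant. */
static uint16_t read_u16_le(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[1] << 8) | p[0]);
}
```

For the byte sequence 0x12, 0x34 the big-endian reading is 0x1234 and the little-endian reading is 0x3412.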

So, we have two strange things in the DBC specification: (1) the start bit is not always the first bit in the stream, and (2) this ‘saw-tooth manner’ comment. So, I create a very simple DBC, and run it through the cantools dump to visualise it:

$ cat mark.dbc

BU_: WS200 MCN_Powertrain

BO_ 1 LittleEnd: 8 WS200
 SG_ OneByte : 7|8@0+ (1,0) [0|0] "" Vector__XXX
 SG_ TwoByte : 15|16@0+ (1,0) [0|0] "" Vector__XXX
 SG_ TwelveBit : 39|12@0+ (1,0) [0|0] "" Vector__XXX

$ cantools dump mark.dbc
================================= Messages =================================

  ------------------------------------------------------------------------

  Name:       LittleEnd
  Id:         0x1
  Length:     8 bytes
  Cycle time: - ms
  Senders:    WS200
  Layout:

                          Bit

             7   6   5   4   3   2   1   0
           +---+---+---+---+---+---+---+---+
         0 |<-----------------------------x|
           +---+---+---+---+---+---+---+---+
                                         +-- OneByte
           +---+---+---+---+---+---+---+---+
         1 |<------------------------------|
           +---+---+---+---+---+---+---+---+
         2 |------------------------------x|
           +---+---+---+---+---+---+---+---+
     B                                   +-- TwoByte
     y     +---+---+---+---+---+---+---+---+
     t   3 |   |   |   |   |   |   |   |   |
     e     +---+---+---+---+---+---+---+---+
         4 |<------------------------------|
           +---+---+---+---+---+---+---+---+
         5 |--------------x|   |   |   |   |
           +---+---+---+---+---+---+---+---+
                         +-- TwelveBit
           +---+---+---+---+---+---+---+---+
         6 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         7 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+

  Signal tree:

    -- {root}
       +-- OneByte
       +-- TwoByte
       +-- TwelveBit

  ------------------------------------------------------------------------

I also used 'cantools generate_c_source mark.dbc' to generate a C decoder for this, and this is what it looks like:

int mark_little_end_unpack(
    struct mark_little_end_t *dst_p,
    const uint8_t *src_p,
    size_t size)
{
    if (size < 8u) {
        return (-EINVAL);
    }

    memset(dst_p, 0, sizeof(*dst_p));

    dst_p->one_byte |= unpack_right_shift_u8(src_p[0], 0u, 0xffu);
    dst_p->two_byte |= unpack_left_shift_u16(src_p[1], 8u, 0xffu);
    dst_p->two_byte |= unpack_right_shift_u16(src_p[2], 0u, 0xffu);
    dst_p->twelve_bit |= unpack_left_shift_u16(src_p[4], 4u, 0xffu);
    dst_p->twelve_bit |= unpack_right_shift_u16(src_p[5], 4u, 0xf0u);

    return (0);
}

Let’s look at that TwoByte value. This is little endian, so the least significant byte should come first in p[1] followed by the most significant in p[2]. So, why does the generator produce:

    dst_p->two_byte |= unpack_left_shift_u16(src_p[1], 8u, 0xffu);
    dst_p->two_byte |= unpack_right_shift_u16(src_p[2], 0u, 0xffu);

From what I can see, that means take p[1] and shift it left 8, then add on p[2]. That seems big endian, not little endian?
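
To check that reading, here is a small sketch that wraps the two generated lines in a function. The helper bodies are an assumption inferred from their names and arguments (mask the byte, then shift); the real definitions are in the file cantools generates. decode_two_byte is a name made up for this example.

```c
#include <stdint.h>

/* Assumed behaviour of the generated helpers: mask first, then shift. */
static uint16_t unpack_left_shift_u16(uint8_t value, uint8_t shift, uint8_t mask)
{
    return (uint16_t)((uint16_t)(value & mask) << shift);
}

static uint16_t unpack_right_shift_u16(uint8_t value, uint8_t shift, uint8_t mask)
{
    return (uint16_t)((uint16_t)(value & mask) >> shift);
}

/* The two TwoByte lines from the generated unpack function. */
static uint16_t decode_two_byte(const uint8_t *src_p)
{
    uint16_t two_byte = 0;

    two_byte |= unpack_left_shift_u16(src_p[1], 8u, 0xffu);
    two_byte |= unpack_right_shift_u16(src_p[2], 0u, 0xffu);

    return two_byte;
}
```

Under that assumption, a frame with src_p[1] = 0x12 and src_p[2] = 0x34 decodes to 0x1234: the byte at the lower index ends up as the most significant byte.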

Let’s try a big endian example:

BO_ 3 BigEnd2: 8 WS200
 SG_ EightBit2 : 5|8@1+ (1,0) [0|0] "" Vector__XXX
 SG_ OneByte2 : 16|8@1+ (1,0) [0|0] "" Vector__XXX
 SG_ TwelveBit2 : 48|11@1+ (1,0) [0|0] "" Vector__XXX

                          Bit

             7   6   5   4   3   2   1   0
           +---+---+---+---+---+---+---+---+
         0 |----------x|   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         1 |   |   |   |<------------------|
           +---+---+---+---+---+---+---+---+
                         +-- EightBit2
           +---+---+---+---+---+---+---+---+
         2 |<-----------------------------x|
           +---+---+---+---+---+---+---+---+
     B       +-- OneByte2
     y     +---+---+---+---+---+---+---+---+
     t   3 |   |   |   |   |   |   |   |   |
     e     +---+---+---+---+---+---+---+---+
         4 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         5 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         6 |------------------------------x|
           +---+---+---+---+---+---+---+---+
         7 |   |   |   |   |   |<----------|
           +---+---+---+---+---+---+---+---+
                                 +-- TwelveBit2

Notice the weird alignment of EightBit2 and TwelveBit2? I presume that is the ‘saw-tooth’ pattern talked about in the DBC specification.
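
A generic bit-by-bit extractor that follows the numbering shown in the cantools layouts can be sketched as below. This is only a sketch under stated assumptions: bit n sits in byte n / 8 at position n % 8 (0 being the least significant bit of that byte); for little endian the start bit is the least significant bit and the signal grows towards higher bit numbers; for big endian the start bit is the most significant bit and the signal walks down to bit 0 of the byte, then continues at bit 7 of the next byte. The names extract_raw and sign_extend are made up, and the endianness is passed as a plain flag rather than the DBC 0/1 digit, since which digit means what is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract a raw unsigned value of `length` bits (1..64) from `data`.
 * No bounds checking: the caller must make sure the signal fits in the frame. */
static uint64_t extract_raw(const uint8_t *data, unsigned start_bit,
                            unsigned length, bool big_endian)
{
    uint64_t value = 0;
    unsigned bit = start_bit;

    for (unsigned i = 0; i < length; i++) {
        uint64_t b = (data[bit / 8] >> (bit % 8)) & 1u;

        if (big_endian) {
            /* Bits arrive most significant first. */
            value = (value << 1) | b;
            /* Step down within the byte; after bit 0, jump to bit 7
             * of the next byte (the saw-tooth). */
            bit = (bit % 8 == 0) ? bit + 15 : bit - 1;
        } else {
            /* Bits arrive least significant first. */
            value |= b << i;
            bit++;
        }
    }

    return value;
}

/* Interpret a `length`-bit raw value as two's complement (signed signals). */
static int64_t sign_extend(uint64_t raw, unsigned length)
{
    if (length < 64 && (raw & (1ULL << (length - 1)))) {
        raw |= ~0ULL << length;
    }
    return (int64_t)raw;
}
```

With this sketch, the 16-bit signal drawn in bytes 1 and 2 of the first layout would be read as extract_raw(data, 15, 16, true), and the 11-bit signal drawn in bytes 6 and 7 as extract_raw(data, 48, 11, false).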

A Tesla Model S encodes values like the battery temperature in CAN ID 0x102, and we decode this in C like this:

StandardMetrics.ms_v_bat_temp->SetValue((float)((((int)d[7]&0x07)<<8)+d[6])/10);

So that is little endian (the low three bits of d[7] are shifted left by 8, then d[6] is added unshifted - so d[7] is the MSB and d[6] the LSB). But how to represent that in DBC?

BO_ 1 LittleEnd2: 8 WS200
 SG_ TwoByte : 47|11@0+ (1,0) [0|0] "" Vector__XXX

                          Bit

             7   6   5   4   3   2   1   0
           +---+---+---+---+---+---+---+---+
         0 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         1 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         2 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
     B   3 |   |   |   |   |   |   |   |   |
     y     +---+---+---+---+---+---+---+---+
     t   4 |   |   |   |   |   |   |   |   |
     e     +---+---+---+---+---+---+---+---+
         5 |<------------------------------|
           +---+---+---+---+---+---+---+---+
         6 |----------x|   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
                     +-- TwoByte
           +---+---+---+---+---+---+---+---+
         7 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+

    dst_p->two_byte |= unpack_left_shift_u16(src_p[5], 3u, 0xffu);
    dst_p->two_byte |= unpack_right_shift_u16(src_p[6], 5u, 0xe0u);

That doesn’t work. How about big endian?

BO_ 1 BigEnd2: 8 WS200
 SG_ TwoByte : 48|11@1+ (1,0) [0|0] "" Vector__XXX

                          Bit

             7   6   5   4   3   2   1   0
           +---+---+---+---+---+---+---+---+
         0 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         1 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         2 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
     B   3 |   |   |   |   |   |   |   |   |
     y     +---+---+---+---+---+---+---+---+
     t   4 |   |   |   |   |   |   |   |   |
     e     +---+---+---+---+---+---+---+---+
         5 |   |   |   |   |   |   |   |   |
           +---+---+---+---+---+---+---+---+
         6 |------------------------------x|
           +---+---+---+---+---+---+---+---+
         7 |   |   |   |   |   |<----------|
           +---+---+---+---+---+---+---+---+
                                 +-- TwoByte

    dst_p->two_byte |= unpack_right_shift_u16(src_p[6], 0u, 0xffu);
    dst_p->two_byte |= unpack_left_shift_u16(src_p[7], 8u, 0x07u);

That seems correct.
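
One way to confirm the match is to put the hand-written expression and the two generated lines side by side. In this sketch the generated helpers are written out inline, assuming they mask first and then shift; both function names are made up for the comparison.

```c
#include <stdint.h>

/* Raw value as computed by the hand-written decode (before the /10 scaling). */
static uint16_t bat_temp_raw_manual(const uint8_t *d)
{
    return (uint16_t)((((int)d[7] & 0x07) << 8) + d[6]);
}

/* The two generated lines, with the helpers expanded inline
 * (assumed behaviour: mask the byte, then shift). */
static uint16_t bat_temp_raw_generated(const uint8_t *src_p)
{
    uint16_t two_byte = 0;

    two_byte |= (uint16_t)((uint16_t)(src_p[6] & 0xffu) >> 0u);
    two_byte |= (uint16_t)((uint16_t)(src_p[7] & 0x07u) << 8u);

    return two_byte;
}
```

Both functions take d[6] as the low eight bits and the low three bits of d[7] as bits 8 to 10, so they agree for any frame.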

Looking in the source for cantools dbc (cantools/cantools/database/can/formats/dbc.py), I see:

byte_order=(0 if signal.byte_order == 'big_endian' else 1),

This code also says the same thing (0 = big endian):

https://github.com/julietkilo/CANBabel/blob/master/src/main/java/com/github/canbabel/canio/dbc/DbcReader.java

boolean isBigEndian = "0".equals(splitted[2]);

WTF?

There is this (from googling), which makes things clear as mud:

https://github.com/ebroecker/canmatrix/wiki/signal-Byteorder
https://se.mathworks.com/help/vnt/ug/canpack.html
https://github.com/ebroecker/SocketCandecodeSignals/blob/master/datenbasis.c#L112-L122
https://github.com/julietkilo/CANBabel/blob/master/src/main/java/com/github/canbabel/canio/dbc/DbcReader.java

My conclusions are that:

The DBC specification is incorrect, in that 0 is big endian, and 1 is little endian in the public code we see.

and/or

The DBC specification is incorrect, in that all CAN bus byte ordering is little endian, and the byte_order setting only affects the interpretation of the bit_ordering, and saw-tooth decode/encode.

But that really makes no sense to me. This is supposed to be the de facto standard for automotive CAN bus signal definition - how can it be such a mess?

Help?

Mark

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 336696867-DBC-Format-2007.pdf
Type: application/pdf
Size: 194343 bytes
Desc: not available
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20190211/d3962efb/attachment-0001.pdf>