How are flags represented in the termios library?

TolkienWASP · Aug 14, 2015 · Viewed 7.9k times

I'm new to C and driver programming. Currently, I'm programming a user space driver to communicate with RS232 over USB using Debian. While researching, I came across the following bit of code.

tty.c_cflag     &=  ~PARENB;            // No Parity
tty.c_cflag     &=  ~CSTOPB;            // 1 Stop Bit
tty.c_cflag     &=  ~CSIZE;
tty.c_cflag     |=  CS8;                // 8 Bits

I understand the consequences of these lines; however, these operations would only make sense if each control flag constant (PARENB, CSTOPB, etc.) were the same width as a combination of these flags. I haven't been able to confirm this in any documentation (one of my main grievances with C so far: easy-to-understand documentation is hard to find).

I would like to make sure I'm understanding the program correctly, since my conclusion is purely inductive and I'm unsure why the flags would be stored this way. Could somebody verify these findings, or point out something I may be overlooking?

Ex.

tty.c_cflag hypothetically is 4-bits long, each of the flags from the
previous code block corresponding to bits 3, 2, 1, 0. Then I believe the
following is how these are stored, if we were to say flags PARENB (3) and
CSTOPB (2) are high, and the other two flags are disabled. 

tty.c_cflag = 1100
PARENB = 1000
CSTOPB = 0100
CSIZE = 0000
CS8 = 0000

Answer

zmo · Aug 14, 2015

In C, the best documentation you'll ever find is the source code itself, which lives on your machine at /usr/include/termios.h (the definitions are actually spread over one or more of the headers it includes). I based my answer on the BSD termios.h used on Apple systems; the exact values are likely to differ depending on your flavour of Unix.

There, you'll find out that your tty object is of type struct termios, defined as follows:

struct termios {
    tcflag_t    c_iflag;    /* input flags */
    tcflag_t    c_oflag;    /* output flags */
    tcflag_t    c_cflag;    /* control flags */
    tcflag_t    c_lflag;    /* local flags */
    cc_t        c_cc[NCCS]; /* control chars */
    speed_t     c_ispeed;   /* input speed */
    speed_t     c_ospeed;   /* output speed */
};

So c_cflag is of type tcflag_t, which is defined by the following line:

typedef unsigned long   tcflag_t;

An unsigned long is at least 4 bytes, i.e. 32 bits (on many 64-bit systems it is actually 8 bytes), which leaves room for at least 32 independent flag bits.

All the flags you're using in your example are then defined as 32-bit values, written as 8-digit hexadecimal constants:

#define PARENB      0x00001000  /* parity enable */
#define CSTOPB      0x00000400  /* send 2 stop bits */
#define CSIZE       0x00000300  /* character size mask */
#define CS8         0x00000300      /* 8 bits */
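
If you'd rather check the actual sizes and values on your own system than trust headers copied from the web, a small probe like this prints them (the output differs between BSD/macOS and Linux, since the two define different constants):

#include <stdio.h>
#include <termios.h>

int main(void) {
    /* Print the width of the flag type and a few control-flag constants. */
    printf("sizeof(tcflag_t) = %zu\n", sizeof(tcflag_t));
    printf("PARENB = 0x%08lx\n", (unsigned long)PARENB);
    printf("CSTOPB = 0x%08lx\n", (unsigned long)CSTOPB);
    printf("CSIZE  = 0x%08lx\n", (unsigned long)CSIZE);
    printf("CS8    = 0x%08lx\n", (unsigned long)CS8);
    return 0;
}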

That being said, the way it works is that c_cflag is used as a bit array, meaning that each bit carries one setting. This is a commonly used technique because bit operations are "cheap" in processing power (your CPU can do a bitwise operation in one cycle) and "cheap" in memory: instead of an array of 32 booleans (a boolean typically occupying a whole byte to store a single binary value), you store 8 binary values per byte.

Another advantage, and optimization, is that because your CPU registers are at least 32 bits wide (and likely 64 bits in 2015), it can apply a mask over all 32 flag bits in a single operation.
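
As a toy illustration of such a bit array (the OPT_* names below are made up for the example; they are not termios flags):

#include <stdio.h>
#include <stdint.h>

/* Three made-up options packed into one 32-bit word, one bit each. */
#define OPT_A (1u << 0)
#define OPT_B (1u << 1)
#define OPT_C (1u << 2)

int main(void) {
    uint32_t opts = 0;

    opts |= OPT_A | OPT_C;   /* set A and C in a single operation */
    opts &= ~OPT_C;          /* clear C, leaving the other bits untouched */

    printf("A is %s\n", (opts & OPT_A) ? "on" : "off");   /* on  */
    printf("B is %s\n", (opts & OPT_B) ? "on" : "off");   /* off */
    printf("C is %s\n", (opts & OPT_C) ? "on" : "off");   /* off */
    return 0;
}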

An alternative representation of the bitmask would be to create a struct like the following:

#include <stdbool.h>
#include <stdint.h>

/* One named field per c_cflag flag instead of one bit per flag. */
struct tcflag_t {
    bool cignore;
    uint8_t csize;      /* the character size needs more than one bit */
    bool cstopb;
    bool cread;
    bool parenb;
    bool hupcl;
    bool clocal;
    bool ccts_oflow;
    bool crts_iflow;
    bool cdtr_iflow;
    bool cdsr_oflow;
    bool ccar_oflow;
};

That struct would take 12 bytes, and updating every setting would take 12 separate assignments instead of a handful of mask operations.
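
If you want to see the size difference for yourself, here's a quick check; the struct simply repeats the hypothetical layout above, renamed to cflag_fields so it isn't confused with the real tcflag_t typedef:

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <termios.h>

/* Same hypothetical "one field per flag" layout as above. */
struct cflag_fields {
    bool cignore;
    uint8_t csize;
    bool cstopb;
    bool cread;
    bool parenb;
    bool hupcl;
    bool clocal;
    bool ccts_oflow;
    bool crts_iflow;
    bool cdtr_iflow;
    bool cdsr_oflow;
    bool ccar_oflow;
};

int main(void) {
    printf("struct of fields: %zu bytes\n", sizeof(struct cflag_fields)); /* typically 12 */
    printf("tcflag_t bitmask: %zu bytes\n", sizeof(tcflag_t));            /* typically 4 or 8 */
    return 0;
}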

Then the operations you can do on those bits follow Boolean logic, which is defined by truth tables:

The And (&), Or (|) and Not (~) truth tables:

| a | b | & |    | a | b | | |    | a | ~ |
| - | - | - |    | - | - | - |    | - | - |
| 0 | 0 | 0 |    | 0 | 0 | 0 |    | 0 | 1 |
| 0 | 1 | 0 |    | 0 | 1 | 1 |    | 1 | 0 |
| 1 | 0 | 0 |    | 1 | 0 | 1 |
| 1 | 1 | 1 |    | 1 | 1 | 1 |

We usually nickname the And operator "force to zero" and the Or operator "force to one": And-ing a bit with 0 always yields 0 no matter what the bit was, and Or-ing a bit with 1 always yields 1.
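
If you want to convince yourself, the tables can be reproduced directly in C (a tiny standalone check, nothing termios-specific):

#include <stdio.h>

int main(void) {
    /* The And, Or and Not truth tables, evaluated on single bits. */
    printf("0&0=%d  0&1=%d  1&0=%d  1&1=%d\n", 0 & 0, 0 & 1, 1 & 0, 1 & 1);
    printf("0|0=%d  0|1=%d  1|0=%d  1|1=%d\n", 0 | 0, 0 | 1, 1 | 0, 1 | 1);
    printf("lowest bit of ~0 = %d, of ~1 = %d\n", ~0 & 1, ~1 & 1);
    return 0;
}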

So if we consider that tty.c_cflag = 0x00000000 and you want to enable the parity check:

tty.c_cflag |= PARENB;

and then tty.c_cflag will contain 0x00001000, i.e. 0b1000000000000

Then say you want to set up a 7-bit character size:

tty.c_cflag |= CS7;

and tty.c_cflag will contain 0x00001200, i.e. 0b1001000000000
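
You can verify this combination yourself; with the BSD header values quoted above this prints 0x00001200 (Linux defines different numeric constants, so the printed value will differ there):

#include <stdio.h>
#include <termios.h>

int main(void) {
    tcflag_t cflag = 0;
    cflag |= PARENB;    /* enable parity checking */
    cflag |= CS7;       /* 7-bit character size */
    printf("cflag = 0x%08lx\n", (unsigned long)cflag);
    return 0;
}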


Now, let's get back to your question: your "equivalent" example is not really representative, because it assumes CSIZE and CS8 have no bits set.

So let's walk through the code you quoted:

tty.c_cflag     &=  ~PARENB;            // No Parity
tty.c_cflag     &=  ~CSTOPB;            // 1 Stop Bit
tty.c_cflag     &=  ~CSIZE;
tty.c_cflag     |=  CS8;                // 8 Bits

Here, tty.c_cflag contains an unknown value:

0b????????????????????????????????

And you know you want no parity, one stop bit, and a data size of 8 bits. So here you're negating the "set parity" value to turn it off, which gives a mask with every bit set except the PARENB bit:

~PARENB == 0b11111111111111111110111111111111

And then using the And operator, you're forcing the bit to zero:

tty.c_cflag &= ~PARENB → 0b???????????????????0????????????

Then you do the same with CSTOPB:

tty.c_cflag &= ~CSTOPB → 0b???????????????????0?0??????????

and finally CSIZE:

tty.c_cflag &= ~CSIZE  → 0b???????????????????0?000????????

For CSIZE, the goal is to make sure the two bits that encode the data length are both reset. Then you set the length you want by forcing the right bits to 1:

tty.c_cflag |= CS8     → 0b???????????????????0?011????????

Actually, clearing CSIZE and then setting CS8 is redundant here, because OR-ing with CS8 forces both size bits to 1 whatever their previous value was. But it is good practice: if you later switch from CS8 to CS7, the OR only sets one of the two bits, and without the clear the other bit would keep its old value, so the size would still read as CS8.
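
Here's a small demonstration of why the clear matters when switching to CS7; the "wrong" variant skips the &= ~CSIZE step and ends up still reporting an 8-bit size:

#include <stdio.h>
#include <termios.h>

int main(void) {
    /* Start from a c_cflag that already has CS8 (both size bits set). */
    tcflag_t flags = CS8;

    /* Wrong: OR-ing CS7 without clearing CSIZE leaves the extra CS8 bit set. */
    tcflag_t wrong = flags | CS7;

    /* Right: clear the whole CSIZE field, then set CS7. */
    tcflag_t right = (flags & ~CSIZE) | CS7;

    printf("wrong: size reads as %s\n", (wrong & CSIZE) == CS7 ? "CS7" : "not CS7");
    printf("right: size reads as %s\n", (right & CSIZE) == CS7 ? "CS7" : "not CS7");
    return 0;
}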

Finally, when you apply this termios structure to the opened serial port (typically with tcsetattr()), the driver will use the values you've set, keep its defaults for everything you haven't touched, and you'll be able to use your serial port.
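
Putting it all together, here is a minimal sketch of configuring and opening a port; the device path /dev/ttyUSB0 and the 9600 baud rate are assumptions for the example, and error handling is kept to a bare minimum:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <termios.h>

int main(void) {
    int fd = open("/dev/ttyUSB0", O_RDWR | O_NOCTTY);   /* assumed device path */
    if (fd < 0) { perror("open"); return 1; }

    struct termios tty;
    if (tcgetattr(fd, &tty) != 0) {                      /* read current settings */
        perror("tcgetattr");
        close(fd);
        return 1;
    }

    tty.c_cflag &= ~PARENB;      /* no parity */
    tty.c_cflag &= ~CSTOPB;      /* 1 stop bit */
    tty.c_cflag &= ~CSIZE;       /* clear the size bits... */
    tty.c_cflag |= CS8;          /* ...then select 8 data bits */

    cfsetispeed(&tty, B9600);    /* input speed, assumed 9600 baud */
    cfsetospeed(&tty, B9600);    /* output speed */

    if (tcsetattr(fd, TCSANOW, &tty) != 0) {             /* apply immediately */
        perror("tcsetattr");
        close(fd);
        return 1;
    }

    /* ... read() / write() on fd ... */
    close(fd);
    return 0;
}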

I hope this explanation helps you better understand what's going on with the flag settings on the serial port, and the use of bitmasks in general. FYI, the same principle is used for plenty of other things, for example IPv4 netmasks, file open flags, etc.