I don't have much experience with Serial I/O, but have recently been tasked with fixing some highly flawed serial code, because the original programmer has left the company.
The application is a Windows program that talks to a scientific instrument serially via a virtual COMM port running on USB. Virtual COMM port USB drivers are provided by FTDI, since they manufacture the USB chip we use on the instrument.
The serial code is in an unmanaged C++ DLL, which is shared by both our old C++ software, and our new C# / .Net (WinForms) software.
There are two main problems:
Fails on many XP systems
When the first command is sent to the instrument, there's no response. When you issue the next command, you get the response from the first one.
Here's a typical usage scenario (full source for methods called is included below):
char szBuf [256];
CloseConnection ();
if (OpenConnection ())
{
ClearBuffer ();
// try to get a firmware version number
WriteChar ((char) 'V');
BOOL versionReadStatus1 = ReadString (szBuf, 100);
...
}
On a failing system, the ReadString call will never receive any serial data, and times out. But if we issue another, different command, and call ReadString again, it will return the response from the first command, not the new one!
But this only happens on a large subset of Windows XP systems - and never on Windows 7. As luck would have it, our XP dev machines worked OK, so we did not see the problem until we started beta testing. But I can also reproduce the problem by running an XP VM (VirtualBox) on my XP dev machine. Also, the problem only occurs when using the DLL with the new C# version - works fine with the old C++ app.
This seemed to be resolved when I added a Sleep(21) to the low level BytesInQue method before calling ClearCommError, but this exacerbated the other problem - CPU usage. Sleeping for less than 21 ms would make the failure mode reappear.
High CPU usage
When doing serial I/O CPU use is excessive - often above 90%. This happens with both the new C# app and the old C++ app, but is much worse in the new app. Often makes the UI very non-responsive, but not always.
Here's the code for our Port.cpp class, in all it's terrible glory. Sorry for the length, but this is what I'm working with. Most important methods are probably OpenConnection, ReadString, ReadChar, and BytesInQue.
//
// Port.cpp: Implements the CPort class, which is
// the class that controls the serial port.
//
// Copyright (C) 1997-1998 Microsoft Corporation
// All rights reserved.
//
// This source code is only intended as a supplement to the
// Broadcast Architecture Programmer's Reference.
// For detailed information regarding Broadcast
// Architecture, see the reference.
//
#include <windows.h>
#include <stdio.h>
#include <assert.h>
#include "port.h"
// Construction code to initialize the port handle to null.
CPort::CPort()
{
m_hDevice = (HANDLE)0;
// default parameters
m_uPort = 1;
m_uBaud = 9600;
m_uDataBits = 8;
m_uParity = 0;
m_uStopBits = 0; // = 1 stop bit
m_chTerminator = '\n';
m_bCommportOpen = FALSE;
m_nTimeOut = 50;
m_nBlockSizeMax = 2048;
}
// Destruction code to close the connection if the port
// handle was valid.
CPort::~CPort()
{
if (m_hDevice)
CloseConnection();
}
// Open a serial communication port for writing short
// one-byte commands, that is, overlapped data transfer
// is not necessary.
BOOL CPort::OpenConnection()
{
char szPort[64];
m_bCommportOpen = FALSE;
// Build the COM port string as "COMx" where x is the port.
if (m_uPort > 9)
wsprintf(szPort, "\\\\.\\COM%d", m_uPort);
else
wsprintf(szPort, "COM%d", m_uPort);
// Open the serial port device.
m_hDevice = CreateFile(szPort,
GENERIC_WRITE | GENERIC_READ,
0,
NULL, // No security attributes
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
if (m_hDevice == INVALID_HANDLE_VALUE)
{
SaveLastError ();
m_hDevice = (HANDLE)0;
return FALSE;
}
return SetupConnection(); // After the port is open, set it up.
} // end of OpenConnection()
// Configure the serial port with the given settings.
// The given settings enable the port to communicate
// with the remote control.
BOOL CPort::SetupConnection(void)
{
DCB dcb; // The DCB structure differs betwwen Win16 and Win32.
dcb.DCBlength = sizeof(DCB);
// Retrieve the DCB of the serial port.
BOOL bStatus = GetCommState(m_hDevice, (LPDCB)&dcb);
if (bStatus == 0)
{
SaveLastError ();
return FALSE;
}
// Assign the values that enable the port to communicate.
dcb.BaudRate = m_uBaud; // Baud rate
dcb.ByteSize = m_uDataBits; // Data bits per byte, 4-8
dcb.Parity = m_uParity; // Parity: 0-4 = no, odd, even, mark, space
dcb.StopBits = m_uStopBits; // 0,1,2 = 1, 1.5, 2
dcb.fBinary = TRUE; // Binary mode, no EOF check : Must use binary mode in NT
dcb.fParity = dcb.Parity == 0 ? FALSE : TRUE; // Enable parity checking
dcb.fOutX = FALSE; // XON/XOFF flow control used
dcb.fInX = FALSE; // XON/XOFF flow control used
dcb.fNull = FALSE; // Disable null stripping - want nulls
dcb.fOutxCtsFlow = FALSE;
dcb.fOutxDsrFlow = FALSE;
dcb.fDsrSensitivity = FALSE;
dcb.fDtrControl = DTR_CONTROL_ENABLE;
dcb.fRtsControl = RTS_CONTROL_DISABLE ;
// Configure the serial port with the assigned settings.
// Return TRUE if the SetCommState call was not equal to zero.
bStatus = SetCommState(m_hDevice, &dcb);
if (bStatus == 0)
{
SaveLastError ();
return FALSE;
}
DWORD dwSize;
COMMPROP *commprop;
DWORD dwError;
dwSize = sizeof(COMMPROP) + sizeof(MODEMDEVCAPS) ;
commprop = (COMMPROP *)malloc(dwSize);
memset(commprop, 0, dwSize);
if (!GetCommProperties(m_hDevice, commprop))
{
dwError = GetLastError();
}
m_bCommportOpen = TRUE;
return TRUE;
}
void CPort::SaveLastError ()
{
DWORD dwLastError = GetLastError ();
LPVOID lpMsgBuf;
FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
FORMAT_MESSAGE_FROM_SYSTEM |
FORMAT_MESSAGE_IGNORE_INSERTS,
NULL,
dwLastError,
MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // Default language
(LPTSTR) &lpMsgBuf,
0,
NULL);
strcpy (m_szLastError,(LPTSTR)lpMsgBuf);
// Free the buffer.
LocalFree( lpMsgBuf );
}
void CPort::SetTimeOut (int nTimeOut)
{
m_nTimeOut = nTimeOut;
}
// Close the opened serial communication port.
void CPort::CloseConnection(void)
{
if (m_hDevice != NULL &&
m_hDevice != INVALID_HANDLE_VALUE)
{
FlushFileBuffers(m_hDevice);
CloseHandle(m_hDevice); ///that the port has been closed.
}
m_hDevice = (HANDLE)0;
// Set the device handle to NULL to confirm
m_bCommportOpen = FALSE;
}
int CPort::WriteChars(char * psz)
{
int nCharWritten = 0;
while (*psz)
{
nCharWritten +=WriteChar(*psz);
psz++;
}
return nCharWritten;
}
// Write a one-byte value (char) to the serial port.
int CPort::WriteChar(char c)
{
DWORD dwBytesInOutQue = BytesInOutQue ();
if (dwBytesInOutQue > m_dwLargestBytesInOutQue)
m_dwLargestBytesInOutQue = dwBytesInOutQue;
static char szBuf[2];
szBuf[0] = c;
szBuf[1] = '\0';
DWORD dwBytesWritten;
DWORD dwTimeOut = m_nTimeOut; // 500 milli seconds
DWORD start, now;
start = GetTickCount();
do
{
now = GetTickCount();
if ((now - start) > dwTimeOut )
{
strcpy (m_szLastError, "Timed Out");
return 0;
}
WriteFile(m_hDevice, szBuf, 1, &dwBytesWritten, NULL);
}
while (dwBytesWritten == 0);
OutputDebugString(TEXT(strcat(szBuf, "\r\n")));
return dwBytesWritten;
}
int CPort::WriteChars(char * psz, int n)
{
DWORD dwBytesWritten;
WriteFile(m_hDevice, psz, n, &dwBytesWritten, NULL);
return dwBytesWritten;
}
// Return number of bytes in RX queue
DWORD CPort::BytesInQue ()
{
COMSTAT ComStat ;
DWORD dwErrorFlags;
DWORD dwLength;
// check number of bytes in queue
ClearCommError(m_hDevice, &dwErrorFlags, &ComStat ) ;
dwLength = ComStat.cbInQue;
return dwLength;
}
DWORD CPort::BytesInOutQue ()
{
COMSTAT ComStat ;
DWORD dwErrorFlags;
DWORD dwLength;
// check number of bytes in queue
ClearCommError(m_hDevice, &dwErrorFlags, &ComStat );
dwLength = ComStat.cbOutQue ;
return dwLength;
}
int CPort::ReadChars (char* szBuf, int nMaxChars)
{
if (BytesInQue () == 0)
return 0;
DWORD dwBytesRead;
ReadFile(m_hDevice, szBuf, nMaxChars, &dwBytesRead, NULL);
return (dwBytesRead);
}
// Read a one-byte value (char) from the serial port.
int CPort::ReadChar (char& c)
{
static char szBuf[2];
szBuf[0] = '\0';
szBuf[1] = '\0';
if (BytesInQue () == 0)
return 0;
DWORD dwBytesRead;
ReadFile(m_hDevice, szBuf, 1, &dwBytesRead, NULL);
c = *szBuf;
if (dwBytesRead == 0)
return 0;
return dwBytesRead;
}
BOOL CPort::ReadString (char *szStrBuf , int nMaxLength)
{
char str [256];
char str2 [256];
DWORD dwTimeOut = m_nTimeOut;
DWORD start, now;
int nBytesRead;
int nTotalBytesRead = 0;
char c = ' ';
static char szCharBuf [2];
szCharBuf [0]= '\0';
szCharBuf [1]= '\0';
szStrBuf [0] = '\0';
start = GetTickCount();
while (c != m_chTerminator)
{
nBytesRead = ReadChar (c);
nTotalBytesRead += nBytesRead;
if (nBytesRead == 1 && c != '\r' && c != '\n')
{
*szCharBuf = c;
strncat (szStrBuf,szCharBuf,1);
if (strlen (szStrBuf) == nMaxLength)
return TRUE;
// restart timer for next char
start = GetTickCount();
}
// check for time out
now = GetTickCount();
if ((now - start) > dwTimeOut )
{
strcpy (m_szLastError, "Timed Out");
return FALSE;
}
}
return TRUE;
}
int CPort::WaitForQueToFill (int nBytesToWaitFor)
{
DWORD start = GetTickCount();
do
{
if (BytesInQue () >= nBytesToWaitFor)
break;
if (GetTickCount() - start > m_nTimeOut)
return 0;
} while (1);
return BytesInQue ();
}
int CPort::BlockRead (char * pcInputBuffer, int nBytesToRead)
{
int nBytesRead = 0;
int charactersRead;
while (nBytesToRead >= m_nBlockSizeMax)
{
if (WaitForQueToFill (m_nBlockSizeMax) < m_nBlockSizeMax)
return nBytesRead;
charactersRead = ReadChars (pcInputBuffer, m_nBlockSizeMax);
pcInputBuffer += charactersRead;
nBytesRead += charactersRead;
nBytesToRead -= charactersRead;
}
if (nBytesToRead > 0)
{
if (WaitForQueToFill (nBytesToRead) < nBytesToRead)
return nBytesRead;
charactersRead = ReadChars (pcInputBuffer, nBytesToRead);
nBytesRead += charactersRead;
nBytesToRead -= charactersRead;
}
return nBytesRead;
}
Based on my testing and reading, I see several suspicious things in this code:
COMMTIMEOUTS is never set. MS docs say "Unpredictable results can occur if you fail to set the time-out values". But I tried setting this, and it didn't help.
Many methods (e.g. ReadString) will go into a tight loop and hammer the port with repeated reads if they don't get data immediately . This seems to explain the high CPU usage.
Many methods have their own timeout handling, using GetTickCount(). Isn't that what COMMTIMEOUTS is for?
In the new C# (WinForms) program, all these serial routines are called directly from the main thread, from a MultiMediaTimer event. Maybe should be run in a different thread?
BytesInQue method seems to be a bottleneck. If I break to debugger when CPU usage is high, that's usually where the program stops. Also, adding a Sleep(21) to this method before calling ClearCommError seems to resolve the XP problem, but exacerbates the CPU usage problem.
Code just seems unnecessarily complicated.
My Questions
Can anyone explain why this only works with a C# program on a small number of XP systems?
Any suggestions on how to rewrite this? Pointers to good sample code would be most welcome.
There are some serious problems with that class and it makes things even worse that there is a Microsoft copyright on it.
There is nothing special about this class. And it makes me wonder why it even exists except as an Adapter over Create/Read/WriteFile. You wouldnt even need this class if you used the SerialPort class in the .NET Framework.
Your CPU usage is because the code goes into an infinite loop while waiting for the device to have enough available data. The code might as well say while(1);
If you must stick with Win32 and C++ you can look into Completion Ports and setting the OVERLAPPED flag when invoking CreateFile. This way you can wait for data in a separate worker thread.
You need to be careful when communicating to multiple COM ports. It has been a long time since I've done C++ but I believe the static buffer szBuff in the Read and Write methods is static for ALL instances of that class. It means if you invoke Read against two different COM ports "at the same time" you will have unexpected results.
As for the problems on some of the XP machines, you will most certainly figure out the problem if you check GetLastError after each Read/Write and log the results. It should be checking GetLastError anyways as it sometimes isn't always an "error" but a request from the subsystem to do something else in order to get the result you want.
You can get rid of the the whole while loop for blocking if you set COMMTIMEOUTS
correctly. If there is a specific timeout for a Read
operation use SetCommTimeouts
before you perform the read.
I set ReadIntervalTimeout
to the max timeout to ensure that the Read won't return quicker than m_nTimeOut. This value will cause Read to return if the time elapses between any two bytes. If it was set to 2 milliseconds and the first byte came in at t, and the second came in at t+1, the third at t+4, ReadFile would of only returned the first two bytes since the interval between the bytes was surpassed. ReadTotalTimeoutConstant ensures that you will never wait longer than m_nTimeOut no matter what.
maxWait = BytesToRead * ReadTotalTimeoutMultiplier + ReadTotalTimeoutConstant
. Thus (BytesToRead * 0) + m_nTimeout = m_nTimeout
BOOL CPort::SetupConnection(void)
{
// Snip...
COMMTIMEOUTS comTimeOut;
comTimeOut.ReadIntervalTimeout = m_nTimeOut; // Ensure's we wait the max timeout
comTimeOut.ReadTotalTimeoutMultiplier = 0;
comTimeOut.ReadTotalTimeoutConstant = m_nTimeOut;
comTimeOut.WriteTotalTimeoutMultiplier = 0;
comTimeOut.WriteTotalTimeoutConstant = m_nTimeOut;
SetCommTimeouts(m_hDevice,&comTimeOut);
}
// If return value != nBytesToRead check check GetLastError()
// Most likely Read timed out.
int CPort::BlockRead (char * pcInputBuffer, int nBytesToRead)
{
DWORD dwBytesRead;
if (FALSE == ReadFile(
m_hDevice,
pcInputBuffer,
nBytesToRead,
&dwBytesRead,
NULL))
{
// Check GetLastError
return dwBytesRead;
}
return dwBytesRead;
}
I have no idea if this is completely correct but it should give you an idea. Remove the ReadChar and ReadString methods and use this if your program relies on things being synchronous. Be careful about setting high time outs also. Communications are fast, in the milliseconds.