How to properly handle blank, null, or empty lines in C#

Drifter64 picture Drifter64 · Nov 29, 2015 · Viewed 18.2k times · Source

I have some C# code that is working with a text file, and i can not seem to get it to work properly with blank or null (whitespace) lines.

My code:

        while (!file.EndOfStream)
        {
            line = file.ReadLine();

            bool isComment = (line[0] == '/') && (line[1] == '/');
            bool isPoint = (line[0] == '(') && (line[line.Length - 1] == ')');
            bool isWhiteSpace = string.IsNullOrEmpty(line);

            Debug.Log("Comment: " + isComment + ", Point: " + isPoint + ", WhiteSpace: " + isWhiteSpace + "Value: '" + line + "'");

            if (!isComment && !isPoint && !isWhiteSpace) { Application.Quit(); }
            else if (isPoint)
            {
                //Strip parenthesis
                line = line.Remove(line.Length - 1, 1).Remove(0, 1);

                //break into float array
                string[] arr = line.Split(',');

                float xVal = float.Parse(arr[0]);
                float yVal = float.Parse(arr[1]);
                float zVal = float.Parse(arr[2]);

                Vector3 currentVector = new Vector3(xVal, yVal, zVal);
                results.Add(currentVector);
            }
        }

You can see that i happen to be doing things with Vector3. If the line is a comment line or a whitespace line, i want it to do nothing. If it notices parentheses, i want it to assume it is a Vector3 and parse it. Finally, if it is a line that is none of these, i want it to stop entirely. Here is a sample text file that i have created just with Notepad:

//This is a comment
// ... and so is this!
(0, -1.5, 3)
(1, 4, 1.23)

(3, 5, 2)

Notice that there is a gap between the second and third Vector3. In this particular case, the line is completely empty, it does not contain spaces or anything, i simply pressed [Enter][Enter] in Notepad. When my script reaches this line, it seems to trigger the file.EndOfStream boolean.... but its NOT the end of the file! How can i fix this? Is there a more appropriate condition for my while loop? I have also tried reading the line in and checking if it is null as the while condition, which is a more popular way to approach this, but this way of doing things also does not work for my case.

** Note: "file" is a variable of type StreamReader **

Answer

Corey picture Corey · Nov 30, 2015

This is more of a style note than an answer, although this also will prevent the issues you were seeing.

First, with a StreamReader when you call ReadLine you will only receive a null result when you reach the end of the file. You also don't care about whitespace at the start and end of your lines, and presumably don't care about lines that are entirely whitespace either. So you can use that to test for end of file and empty lines this way:

string line;
while ((line = file.ReadLine()) != null)
{
    line = line.Trim();
    if (line == "")
        continue;
}

Next you have some tests for start/end characters that is still going to cause problems in some situations. Specifically, reading the second character in a line that has only one character is going to cause an exception.

Instead of using indexing on a string of untested length you can use the StartsWith and EndsWith methods to do your tests:

bool isComment = line.StartsWith("//");
bool isPoint = line.StartsWith("(") && line.EndsWith(")");

Finally, in your code that parses the point value you assume that any line that starts with ( and ends with ) will have at least 2 commas in it, and that the text will correctly parse. This is a bad assumption.

The better way to handle all of this is to detect and deal with each case as you go, with the parse functionality broken out to a method you can reuse

Here's my version:

public class Program
{
    public static void Main()
    {
        List<Vector3> results = new List<Vector3>();
        using (var file = System.IO.File.OpenText(@"C:\temp\test.txt"))
        {
            string line;
            while ((line = file.ReadLine()?.Trim()) != null)
            {
                // skip empty lines and comments
                if (line == string.Empty || line.StartsWith("//"))
                    continue;
                // parse all other lines as vectors, exit program on error
                try
                {
                    Vector3 vector = ParseVector(line);
                    results.Add(vector);
                }
                catch (FormatException e)
                {
                    Console.WriteLine("Parse error on line: {0}", line);
                    throw;
                }
            }
        }

        foreach (var v in results)
            Console.WriteLine("({0},{1},{2})", v.X, v.Y, v.Z);
    }

    // parse string in format '(x,y,z)', all as floats
    // throws FormatException on any error
    public static Vector3 ParseVector(string text)
    {
        if (!text.StartsWith("(") || !text.EndsWith(")"))
            throw new FormatException();
        string[] parts = text.Substring(1, text.Length - 1).Split(',');
        if (parts.Length != 3)
            throw new FormatException();
        float x = float.Parse(parts[0]);
        float y = float.Parse(parts[1]);
        float z = float.Parse(parts[2]);
        return new Vector3(x, y, z);
    }
}

If you prefer not to use exceptions you could return null or use the pattern used by the TryParse methods, returning a boolean success/failure indicator and using an out parameter to write the results to. I prefer exceptions in this case.