Handling bad CSV records in CsvHelper

Zak picture Zak · Oct 20, 2017 · Viewed 12.2k times · Source

I would like to be able to iterate through all records in a CSV file and add all the good records to one collection and handle all the "bad" ones separately. I don't seem to be able to do this and I think I must be missing something.

If I attempt to catch the BadDataException then subsequent reads will fail meaning I cannot carry on and read the rest of the file -

while (true)
{
    try
    {
        if (!reader.Read())
            break;

        var record = reader.GetRecord<Record>();
        goodList.Add(record);
    }
    catch (BadDataException ex)
    {
        // Exception is caught but I won't be able to read further rows in file
        // (all further reader.Read() result in same exception thrown)
        Console.WriteLine(ex.Message);
    }
}

The other option discussed is setting the BadDataFound callback action to handle it -

reader.Configuration.BadDataFound = x =>
{
    Console.WriteLine($"Bad data: <{x.RawRecord}>");
};

However although the callback is called the bad record still ends up in my "good list"

Is there some way I can query the reader to see if the record is good before adding it to my list?

For this example my Record definition is -

class Record
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

And the data (first row bad, second row good) -

"Jo"hn","Doe",43
"Jane","Doe",21

Interestingly handling a missing field with MissingFieldException seems to function exactly as I would like - the exception is thrown but subsequent rows are still read ok.

Answer

Josh Close picture Josh Close · Oct 22, 2017

Here is the example I supplied.

void Main()
{
    using (var stream = new MemoryStream())
    using (var writer = new StreamWriter(stream))
    using (var reader = new StreamReader(stream))
    using (var csv = new CsvReader(reader))
    {
        writer.WriteLine("FirstName,LastName");
        writer.WriteLine("\"Jon\"hn\"\",\"Doe\"");
        writer.WriteLine("\"Jane\",\"Doe\"");
        writer.Flush();
        stream.Position = 0;

        var good = new List<Test>();
        var bad = new List<string>();
        var isRecordBad = false;
        csv.Configuration.BadDataFound = context =>
        {
            isRecordBad = true;
            bad.Add(context.RawRecord);
        };
        while (csv.Read())
        {
            var record = csv.GetRecord<Test>();
            if (!isRecordBad)
            {
                good.Add(record);
            }

            isRecordBad = false;
        }

        good.Dump();
        bad.Dump();
    }
}

public class Test
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}