Parsing Strings in SuperCSV

Davidson · Nov 28, 2011 · Viewed 8.2k times

@Carlo V. Dango I have simplified my question and I have read the documentation--good advice not to panic. Still, I have problems. Help me solve one and it will solve them all. Thank you.

Question: When a CSV record is missing a non-String field, how can I convert the missing entry to a default value (if that is even possible), or at least avoid a NullPointerException? The Optional cell processor doesn't appear to prevent the error either.

This is the program, taken essentially from the Super CSV website.

package com.test.csv;
import java.io.FileReader;

import org.supercsv.cellprocessor.ParseBigDecimal;
import org.supercsv.cellprocessor.ParseDate;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.prefs.CsvPreference;


public class CSVReader {

private static final CellProcessor[] cellProcessor = new CellProcessor[] {
    null,
    null,
    new ParseInt(),
    new ParseDate("yyyyMMdd"),      
    new ParseBigDecimal()       
};

public static void main (String[] args ) throws Exception {

    CsvPreference pref = new CsvPreference('"', '|', "\n");

    ICsvBeanReader inFile = new CsvBeanReader(new FileReader("C:\\temp\\sapfilePipe.txt"), pref);
    try {
        final String[] header = inFile.getCSVHeader(true);
        User user;
        while ((user = inFile.read(User.class, header, cellProcessor)) != null) {
            System.out.println(user);
        }
    } finally {
        inFile.close();
    }

}

}

Here is the CSV file I'm reading. Notice that the first record is missing a field (age).

firstName|lastName|age|hireDate|hourlyRate
A.|Smith|  |20110101|15.50

My User bean:

package com.test.csv;

import java.math.BigDecimal;
import java.util.Date;

public class User {

private String firstName;
private String lastName;
private int age;
private Date hireDate;
private BigDecimal hourlyRate;
    ...getters/setters...
}

Here is the error:

Exception in thread "main" java.lang.NullPointerException
    at org.supercsv.io.CsvBeanReader.fillObject(Unknown Source)
    at org.supercsv.io.CsvBeanReader.read(Unknown Source)
    at com.glazers.csv.CSVReader.main(CSVReader.java:31)

Thanks.

Answer

James Bassett · Feb 12, 2012

Edit: Update for Super CSV 2.0.0-beta-1

Super CSV 2.0.0-beta-1 is out now. It includes many bug fixes and new features (including Maven support and a new Dozer extension for mapping nested properties and arrays/Collections).

It has also changed the way empty ("") columns are treated - they are now read as null. This means that the firstName and lastName fields in your bean will now be null instead of "" if they are not present in the CSV file.

The Optional() processor has been updated to cater for this - so it will still function the same way.

My suggestion of using Token is no longer relevant: you should use ConvertNullTo instead:

new ConvertNullTo(-1, new ParseInt())
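
To make the null-to-default idea concrete without pulling in the library, here is a minimal plain-Java sketch. It is not Super CSV's implementation: NullDefaultSketch and its convertNullTo helper are hypothetical names, and a Function<String, Integer> merely stands in for a cell processor.

```java
import java.util.function.Function;

// Plain-Java stand-in for a processor chain; none of these names are Super CSV APIs.
class NullDefaultSketch {

    // Hypothetical helper: return the fallback for a null column, otherwise delegate to next.
    static Function<String, Integer> convertNullTo(Integer fallback, Function<String, Integer> next) {
        return column -> column == null ? fallback : next.apply(column);
    }

    public static void main(String[] args) {
        Function<String, Integer> age = convertNullTo(-1, Integer::parseInt);
        System.out.println(age.apply(null)); // missing column: the fallback is returned
        System.out.println(age.apply("42")); // present column: parsed as an int
    }
}
```

The point of the design is that the parser is never asked to handle a missing value, so it cannot fail on one.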

What you really want is the Optional CellProcessor, which will only allow the next processor in the chain to execute if the column isn't empty.

So update your CellProcessor array to:

private static final CellProcessor[] cellProcessor = new CellProcessor[] {
    null,
    null,
    new Optional(new ParseInt()),
    new ParseDate("yyyyMMdd"),      
    new ParseBigDecimal()       
};

That way, ParseInt will only get executed if the column is not blank (CellProcessors execute from left to right), leaving the int field in the bean with its default value of 0.
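
As a rough illustration of that short-circuit, here is a plain-Java sketch. It does not use Super CSV: OptionalSketch, its optional helper and the nested User class are hypothetical, and the step that skips the setter for a null result is my assumption about how a primitive field ends up keeping its default.

```java
import java.util.function.Function;

// Illustration only; these names are made up and are not Super CSV classes.
class OptionalSketch {

    // Hypothetical stand-in for the Optional idea: skip next for a null or empty column.
    static Function<String, Integer> optional(Function<String, Integer> next) {
        return column -> (column == null || column.isEmpty()) ? null : next.apply(column);
    }

    // Minimal bean with a primitive field, which Java initialises to 0.
    static class User {
        private int age;
        int getAge() { return age; }
        void setAge(int age) { this.age = age; }
    }

    public static void main(String[] args) {
        Function<String, Integer> age = optional(Integer::parseInt);
        User user = new User();
        Integer parsed = age.apply(""); // empty column: parseInt is never called
        // Assumption: the setter is skipped for a null result, so age keeps its default.
        if (parsed != null) {
            user.setAge(parsed);
        }
        System.out.println(user.getAge());
    }
}
```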

If you wanted to set the field to -1 to indicate that no value was supplied, you could use the Token processor, which replaces a given token ("" in this case) with a desired value and passes any other input on to the next processor, i.e.

new Token("", -1, new ParseInt())
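
For anyone who wants to see the token-replacement idea in isolation, here is a small plain-Java sketch. Again, this is not the library's code: TokenSketch and its token helper are hypothetical, with a Function<String, Integer> standing in for the next processor in the chain.

```java
import java.util.function.Function;

// Illustration only; TokenSketch is a made-up name, not part of Super CSV.
class TokenSketch {

    // Hypothetical helper: an exact match on the token yields the replacement,
    // and every other input is handed to the next step.
    static Function<String, Integer> token(String token, Integer replacement,
                                           Function<String, Integer> next) {
        return column -> token.equals(column) ? replacement : next.apply(column);
    }

    public static void main(String[] args) {
        Function<String, Integer> age = token("", -1, Integer::parseInt);
        System.out.println(age.apply(""));   // empty column: replaced by the marker value
        System.out.println(age.apply("27")); // any other input: parsed as usual
    }
}
```

Note that the match is exact, so in this sketch a column containing only spaces would not match "" and would still reach the parser.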

@Carlo V. Dango the CsvListReader is a very primitive implementation (and you lose the ability to map to beans) so I'd only use it for quick and dirty parsing.

And I'd only recommend using null in the array (when reading) for String properties that require no further processing.

By the way, I'm working on the Super CSV project, preparing an upcoming release. I'll be sure to improve the code examples on the website while I'm at it ;)