How to parse a CSV file that might have one of two delimiters?

Coder1224 · Aug 12, 2015

In my case, valid CSV files are delimited by either a comma or a semicolon. I am open to other libraries, but the solution needs to be in Java. Reading through the Apache CSVParser API, the only thing I can think of is the following, which seems inefficient and ugly.

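import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

// declared outside the try block so the fallback branch can reuse them
BufferedReader reader = new BufferedReader(new InputStreamReader(file));
CSVFormat csvFormat;
CSVParser parser;
try
{
   csvFormat = CSVFormat.EXCEL.withHeader().withDelimiter(';');
   parser = csvFormat.parse( reader );
   // now read the records
}
catch (IOException e)
{
   try
   {
      // try the other valid delimiter
      // (the reader may already be partly consumed at this point)
      csvFormat = CSVFormat.EXCEL.withHeader().withDelimiter(',');
      parser = csvFormat.parse( reader );
      // now read the records
   }
   catch (IOException e2)
   {
      // then it's really not a valid CSV file
   }
}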

Is there a way to check the delimiter first, or perhaps allow two delimiters? Anyone have a better idea than just catching an exception?

Answer

Jeronimo Backes · Aug 12, 2015

We built support for this in uniVocity-parsers:

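import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import java.util.List;

public static void main(String... args) {
    CsvParserSettings settings = new CsvParserSettings();
    // let the parser work out the delimiter from the input itself
    settings.setDelimiterDetectionEnabled(true);

    CsvParser parser = new CsvParser(settings);

    // 'file' is the input to parse (for example a java.io.File)
    List<String[]> rows = parser.parseAll(file);
}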

The parser has many more features that I'm sure you will find useful. Give it a try.

Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license).
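
For readers who want to stay with the Apache parser from the question, the "check the delimiter first" idea can also be sketched in plain Java. This is only a rough illustration, not how either library does it: `DelimiterSniffer`, `detect`, and the 8192-character read-ahead limit are hypothetical names and values chosen for this example. It assumes the header line is representative of the file and counts commas and semicolons that appear outside double quotes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical helper; not part of Commons CSV or uniVocity-parsers. */
final class DelimiterSniffer {

    /**
     * Guesses the delimiter by counting ',' and ';' outside double quotes
     * in a single line. Ties and empty input fall back to ','.
     */
    static char detect(String line) {
        int commas = 0;
        int semicolons = 0;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // an escaped quote ("") toggles twice, so the state is preserved
                inQuotes = !inQuotes;
            } else if (!inQuotes) {
                if (c == ',') {
                    commas++;
                } else if (c == ';') {
                    semicolons++;
                }
            }
        }
        return semicolons > commas ? ';' : ',';
    }

    /**
     * Reads the first line, then rewinds the reader so a CSV parser can
     * consume the input from the start. reset() throws an IOException if
     * the first line is longer than readAheadLimit characters.
     */
    static char detect(BufferedReader reader, int readAheadLimit) throws IOException {
        reader.mark(readAheadLimit);
        String header = reader.readLine();
        reader.reset();
        return detect(header == null ? "" : header);
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(
                new StringReader("id;name;note\n1;Ann;\"a, b\"\n"));
        char delimiter = detect(reader, 8192);
        System.out.println("Detected delimiter: " + delimiter);
        // The reader is back at the start here, so it could be passed to
        // CSVFormat.EXCEL.withHeader().withDelimiter(delimiter).parse(reader)
    }
}
```

Looking only at the first line keeps the check cheap, but it will guess wrong on files whose header has a single column or mixes both characters unquoted, so a library with built-in detection is the more robust choice.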