In my case, a valid CSV file is one delimited by either commas or semicolons. I am open to other libraries, but the solution needs to be in Java. Reading through the Apache CSVParser API, the only thing I can think of is the following, which seems inefficient and ugly.
// declared outside the try so the fallback branch can see them
BufferedReader reader = new BufferedReader(new InputStreamReader(file));
CSVFormat csvFormat;
CSVParser parser;
try
{
    csvFormat = CSVFormat.EXCEL.withHeader().withDelimiter(';');
    parser = csvFormat.parse( reader );
    // now read the records
}
catch (IOException eee)
{
    try
    {
        // try the other valid delimiter
        csvFormat = CSVFormat.EXCEL.withHeader().withDelimiter(',');
        parser = csvFormat.parse( reader );
        // now read the records
    }
    catch (IOException eee2)
    {
        // then it's really not a valid CSV file
    }
}
Is there a way to check the delimiter first, or perhaps allow two delimiters? Does anyone have a better idea than just catching an exception?
We built support for this in uniVocity-parsers:
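import java.util.List;
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

public static void main(String... args) {
    CsvParserSettings settings = new CsvParserSettings();
    // let the parser work out the delimiter from the input itself
    settings.setDelimiterDetectionEnabled(true);
    CsvParser parser = new CsvParser(settings);
    // 'file' is the input to parse (e.g. a java.io.File or InputStream)
    List<String[]> rows = parser.parseAll(file);
}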
The parser has many more features that I'm sure you will find useful. Give it a try.
Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license).
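If you would rather stay with Apache Commons CSV, one rough way to "check the delimiter first" is to peek at the header line yourself and count the candidate delimiters before building the CSVFormat. The sketch below uses only the JDK and is not part of either library: the class name DelimiterSniffer, the 8192-character read-ahead limit, and the "whichever occurs more often outside quotes wins" rule are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper: guesses whether a CSV header line uses ';' or ','.
class DelimiterSniffer {

    /** Counts commas and semicolons outside double quotes and returns the more frequent one. */
    static char detect(String headerLine) {
        int commas = 0;
        int semicolons = 0;
        boolean inQuotes = false;
        for (int i = 0; i < headerLine.length(); i++) {
            char c = headerLine.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // an escaped quote ("") toggles twice, so the state is unchanged
            } else if (!inQuotes) {
                if (c == ',') commas++;
                else if (c == ';') semicolons++;
            }
        }
        if (commas == 0 && semicolons == 0) {
            throw new IllegalArgumentException("No comma or semicolon found in header line");
        }
        // Assumption: on a tie, fall back to comma.
        return semicolons > commas ? ';' : ',';
    }

    /** Peeks at the first line, then rewinds so a CSV parser can read from the start. */
    static char detect(BufferedReader reader) throws IOException {
        reader.mark(8192); // assumption: the header line is shorter than 8192 characters
        String first = reader.readLine();
        reader.reset();
        if (first == null) {
            throw new IllegalArgumentException("Empty input");
        }
        return detect(first);
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(
                new StringReader("id;name;price\n1;\"Tea, green\";3,50\n"));
        char delimiter = detect(reader);
        System.out.println("Detected delimiter: " + delimiter);
        // The reader is rewound at this point, so it could be handed to e.g.
        // CSVFormat.EXCEL.withHeader().withDelimiter(delimiter).parse(reader)
    }
}
```

This only inspects the header row, so a single-column file (no delimiter at all) is rejected here; whether that should count as invalid is a design choice you may want to change.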