Unable to parse as integer

svenkapudija · Jan 4, 2011 · Viewed 10.2k times

Alright... I have this .txt file (UTF-8):

4661,SOMETHING,3858884120607,24,24.09
4659,SOMETHING1,3858884120621,24,15.95
4660,SOMETHING2,3858884120614,24,19.58

And this code

FileInputStream fis = new FileInputStream(new File("someTextFile.txt"));
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
BufferedReader in = new BufferedReader(isr);

int i = 0;
String line;
while ((line = in.readLine()) != null) {
    Pattern p = Pattern.compile(",");
    String[] article = p.split(line);

    // I don't know why but when a first line starts with
    // an integer - article[0] (which in .txt file is 4661)
    // becomes someWeirdCharacter4661 so I need to trim it
    // *weird character is like |=>|

    if (i == 0) {
        StringBuffer articleCode = new StringBuffer(article[0]);
        articleCode.deleteCharAt(0);
        article[0] = articleCode.toString();
    }

    SomeArticle**.addOrChange(mContext, Integer.parseInt(article[0]), article[1], article[2], Integer.parseInt(article[3]), Double.parseDouble(article[4]));

    i++;
}

On the emulator it's fine, but on a real device (HTC Desire) I get this (weird) error:

E/AndroidRuntime(16422): java.lang.NumberFormatException: unable to parse '4661' as integer

What's the problem?

** SomeArticle is just a class of mine that takes those parameters as input (Context, int, String, String, int, double)

Answer

sksamuel · Jan 4, 2011

It could be that your file is not UTF-8, or something along those lines.

However, if you want to hack a fix because you are interested only in a solution and not in the problem :) then strip out anything that isn't a digit or a decimal point.

String[] article = p.split(line);
Integer i = Integer.parseInt(article[0].replaceAll("[^0-9.]",""));

The regular expression isn't perfect (it would affect ...999.... for example) but it will do for you.
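To make the trade-off concrete, here is a small sketch of what that filter does to a few inputs. The class and method names (DigitFilterDemo, keepDigitsAndDots) are made up for illustration, and the sample strings are assumptions, not data from the question.

```java
class DigitFilterDemo {
    // Keeps only ASCII digits and dots, using the same regex as above.
    static String keepDigitsAndDots(String s) {
        return s.replaceAll("[^0-9.]", "");
    }

    public static void main(String[] args) {
        // A leading invisible character (here U+FEFF) is removed, so parsing works.
        String firstField = "\uFEFF4661";
        System.out.println(Integer.parseInt(keepDigitsAndDots(firstField))); // prints 4661

        // Side effect: a minus sign is silently dropped.
        System.out.println(keepDigitsAndDots("-12")); // prints 12

        // Dots are kept, so this string is returned unchanged and
        // Integer.parseInt would still throw NumberFormatException on it.
        System.out.println(keepDigitsAndDots("1.2.3")); // prints 1.2.3
    }
}
```

For a field that must be a plain non-negative integer, a stricter pattern such as "[^0-9]" might be a better fit, at the cost of also hiding malformed input.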

EDIT:

It seems I did not read the question properly. If the stray character appears only at the start of the file, then it is very likely a byte order mark (BOM), which signals that the file is Unicode and, in UTF-16/32, whether it is little endian or big endian. You don't tend to see it used very often.

http://unicode.org/faq/utf_bom.html#bom10
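
If the stray character really is a byte order mark, one way to handle it is to check the first line for U+FEFF and drop it before parsing, instead of deleting the first character unconditionally. The sketch below assumes the file is UTF-8 and that Java's UTF-8 decoder passes the BOM through as the character U+FEFF; the names BomAwareReader, stripBom and parseArticleCode are hypothetical, and the in-memory sample stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

class BomAwareReader {
    // U+FEFF is the byte order mark; in UTF-8 it is encoded as EF BB BF.
    static final char BOM = '\uFEFF';

    // Returns the line without a leading BOM; other lines are returned unchanged.
    static String stripBom(String line) {
        if (line != null && line.length() > 0 && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    // Parses the first comma-separated field of a line as an int.
    static int parseArticleCode(String line) {
        return Integer.parseInt(stripBom(line).split(",")[0]);
    }

    public static void main(String[] args) throws IOException {
        // Simulated file content that starts with a BOM (assumed sample data).
        byte[] data = "\uFEFF4661,SOMETHING,3858884120607,24,24.09\n".getBytes("UTF-8");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), "UTF-8"));
        String line = in.readLine();
        System.out.println(parseArticleCode(line)); // prints 4661
    }
}
```

Because stripBom only removes the character when it is actually there, the same code works whether or not the file was saved with a BOM.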