How do I fix the "invalid literal for int() with base 10" error in pandas?

Caribgirl · May 9, 2017

This is the error that shows up whenever I try to convert the dataframe to int:

("invalid literal for int() with base 10: '260,327,021'", 'occurred at index Population1'

Everything in the df is a number. I assume the error is due to the extra quote at the end, but how do I fix it?

Answer

piRSquared · May 9, 2017

If I run this

int('260,327,021')

I get this

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-448-a3ba7c4bd4fe> in <module>()
----> 1 int('260,327,021')

ValueError: invalid literal for int() with base 10: '260,327,021'

I assure you that not everything in your dataframe is a number. It may look like a number, but it is a string with commas in it.
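One way to confirm this yourself is to check whether pandas treats the column as numeric and which values contain a comma. This is only a sketch: the sample values below are made up, and `s` stands in for whichever column of your dataframe raises the error.

```python
import pandas as pd

# Hypothetical sample standing in for the problem column
s = pd.Series(["260,327,021", "1,234", "56"])

# False means pandas is holding text, not numbers
is_numeric = pd.api.types.is_numeric_dtype(s)
print(is_numeric)

# Boolean mask of the entries that contain a thousands separator
has_comma = s.str.contains(",", regex=False)
print(s[has_comma])
```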

You'll want to remove the commas and then convert to int:

import pandas as pd
pd.Series(['260,327,021']).str.replace(',', '').astype(int)

0    260327021
dtype: int64
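The same idea extends to a whole dataframe. The sketch below assumes every column holds comma-formatted strings; the frame and the `Population2` column are invented for illustration (only `Population1` appears in the question). If the data comes from a CSV file, the `thousands` parameter of `pd.read_csv` is an alternative that parses such values as numbers at load time; the inline CSV text here is likewise a made-up stand-in for a real file.

```python
import io

import pandas as pd

# Hypothetical frame; every value is a string with thousands separators
df = pd.DataFrame({
    "Population1": ["260,327,021", "1,234", "56"],
    "Population2": ["7,000", "8", "9,100,200"],
})

# Remove the commas column by column, then cast each column to int
cleaned = df.apply(lambda col: col.str.replace(",", "").astype(int))
print(cleaned)

# Alternative when reading from CSV: let the parser handle the separators
csv_text = 'Population1\n"260,327,021"\n"1,234"\n'
loaded = pd.read_csv(io.StringIO(csv_text), thousands=",")
print(loaded)
```

If only some columns are formatted this way, apply the same replacement to just those columns instead of the whole frame.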