I read data from a .csv file to a Pandas dataframe as below. For one of the columns, namely id, I want to specify the column type as int. The problem is the id series has missing/empty values.
When I try to cast the id column to integer while reading the .csv, I get:
df = pd.read_csv("data.csv", dtype={'id': int})
error: Integer column has NA values
Alternatively, I tried to convert the column type after reading as below, but this time I get:
df = pd.read_csv("data.csv")
df[['id']] = df[['id']].astype(int)
error: Cannot convert NA to integer
How can I tackle this?
The lack of a NaN representation in integer columns is a pandas "gotcha": NaN is a floating-point value, so a plain int column has no way to store it.
The usual workaround is simply to use floats.
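Here is a minimal sketch of the float workaround. The inline CSV text and the name column are made up for illustration and stand in for your data.csv. The last step is an extra, not part of the workaround itself: it assumes pandas 0.24 or later, where the nullable "Int64" extension dtype (capital I) is available, and it only works if the non-missing ids are whole numbers.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: the second row has an empty id.
csv_text = "id,name\n1,a\n,b\n3,c\n"

# Float workaround: read id as float so the empty cell can be stored as NaN.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": float})
print(df["id"].dtype)

# Assumption: pandas >= 0.24. The nullable "Int64" dtype keeps integer
# values and marks the missing entry as NA instead of raising an error.
df["id_nullable"] = df["id"].astype("Int64")
print(df)
```

If the column must be a plain int, the other option is to decide what a missing id means first (drop those rows with dropna, or fill them with a sentinel via fillna) and only then call astype(int).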