We are getting an error when a column has the string datatype and holds values like these:
col1 col2
1 .89
This happens when we run the following code:
def azureml_main(dataframe1 = None, dataframe2 = None):
    # Execution logic goes here
    print('Input pandas.DataFrame #1:')
    import pandas as pd
    import numpy as np
    from sklearn.kernel_approximation import RBFSampler
    # Feature columns passed to the sampler
    x = dataframe1.iloc[:,2:1080]
    print(x)
    # Flatten the single 'colname' column into a 1-D array
    df1 = dataframe1[['colname']]
    change = np.array(df1)
    b = change.ravel()
    print(b)
    rbf_feature = RBFSampler(gamma=1, n_components=100, random_state=1)
    print(rbf_feature)
    print("test")
    # Fails here if x still contains string values
    X_features = rbf_feature.fit_transform(x)
After this we get an error saying it cannot convert a non-int into type float.
Use astype(float), e.g.:
df['col'] = df['col'].astype(float)
or convert_objects:
df = df.convert_objects(convert_numeric=True)
Example:
In [379]:
df = pd.DataFrame({'a':['1.23', '0.123']})
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a 2 non-null object
dtypes: object(1)
memory usage: 32.0+ bytes
In [380]:
df['a'].astype(float)
Out[380]:
0 1.230
1 0.123
Name: a, dtype: float64
In [382]:
df = df.convert_objects(convert_numeric=True)
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a 2 non-null float64
dtypes: float64(1)
memory usage: 32.0 bytes
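Since the question passes a whole slice of columns to the transformer, the same cast can be applied to the slice in one go. This is only a minimal sketch: the two-column frame and its values are made up for illustration, and the slice bounds stand in for the 2:1080 range in the question.

```python
import pandas as pd

# Hypothetical stand-in for dataframe1: numeric values stored as strings
df = pd.DataFrame({"col1": ["1", "2"], "col2": [".89", "0.5"]})

# Cast every column in the slice to float before handing it to scikit-learn
x = df.iloc[:, 0:2].astype(float)

print(x.dtypes)
```

astype(float) raises if any value in the slice cannot be parsed as a number, so it suits data that is known to be clean.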
UPDATE
If you're running version 0.17.0 or later then convert_objects has been deprecated in favour of the functions to_numeric, to_datetime, and to_timedelta, so instead of:
df['col'] = df['col'].astype(float)
you can do:
df['col'] = pd.to_numeric(df['col'])
Note that by default any non-convertible value will raise an error. If you want such values to be forced to NaN instead, then do:
df['col'] = pd.to_numeric(df['col'], errors='coerce')
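To see the difference between the two behaviours, here is a small sketch. The Series and its contents are invented for the example; 'abc' stands in for whatever unparseable value your column might contain.

```python
import pandas as pd

# Hypothetical column: two numeric strings and one value that is not a number
s = pd.Series(["1", ".89", "abc"])

# Default behaviour: the unparseable value raises
try:
    pd.to_numeric(s)
except ValueError as exc:
    print("default raises:", exc)

# errors='coerce' turns anything unparseable into NaN instead
coerced = pd.to_numeric(s, errors="coerce")
print(coerced)

# Counting the NaNs shows how many values were lost in the conversion
n_bad = int(coerced.isna().sum())
print("values coerced to NaN:", n_bad)
```

Checking the NaN count after coercing is worthwhile, because scikit-learn transformers will generally reject NaN input unless you fill or drop those rows first.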