I have a question about k-means clustering in Python.
This is how I ran the analysis:
from sklearn.cluster import KMeans
km = KMeans(n_clusters=12, random_state=1)
new = data._get_numeric_data().dropna(axis=1)
km.fit(new)
predict = km.predict(new)
How can I add the cluster results to my original dataframe "data" as an additional column? Thanks!
Assuming the array of predictions has the same length as your dataframe df (called "data" in the question), all you need to do is this:
import pandas as pd
df['NEW_COLUMN'] = pd.Series(predict, index=df.index)
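For completeness, here is a minimal end-to-end sketch of the same idea using the variable names from the question. The toy dataframe, the column name "cluster", and n_clusters=2 are made up for illustration, and select_dtypes is swapped in as a public alternative to the private _get_numeric_data. Treat it as one possible way to wire this up, not the only one.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy data (hypothetical): a text column, two clean numeric columns,
# and one numeric column with a missing value.
data = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "x": [1.0, 1.1, 0.9, 8.0, 8.1, 7.9],
    "y": [1.0, 0.9, 1.1, 8.0, 7.9, 8.1],
    "z": [0.5, None, 0.4, 0.3, 0.2, 0.1],
})

# Same preprocessing as in the question: keep numeric columns,
# then drop any column that contains NaN (here that removes "z").
new = data.select_dtypes(include="number").dropna(axis=1)

# n_clusters=2 only because the toy data has two obvious groups.
km = KMeans(n_clusters=2, n_init=10, random_state=1)
labels = km.fit_predict(new)

# Build the Series on new.index so each label lands on the row it was computed from.
data["cluster"] = pd.Series(labels, index=new.index)
print(data)
```

Because dropna(axis=1) removes columns and not rows, new has the same index as data, so the labels line up one to one. If you had dropped rows instead, building the Series on new.index would leave NaN in the cluster column for the rows that were not clustered.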