How to add k-means predicted clusters in a column to a dataframe in Python

python pandas scikit-learn cluster-analysis k-means

Keithx · Jul 14, 2016 · Viewed 13k times · Source

I have a question about k-means clustering in Python.

I ran the analysis like this:

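from sklearn.cluster import KMeans

km = KMeans(n_clusters=12, random_state=1)
# keep the numeric columns only, then drop any column that contains NaN
new = data._get_numeric_data().dropna(axis=1)
km.fit(new)
# one cluster label (0-11) per row of new
predict = km.predict(new)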

How can I add the cluster results to my original dataframe "data" as an additional column? Thanks!

Answer

Assuming predict has the same length as the columns of your dataframe (called df here; it is data in the question), all you need to do is this:

import pandas as pd
df['NEW_COLUMN'] = pd.Series(predict, index=df.index)
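For reference, here is a minimal end-to-end sketch of the same idea. The toy dataframe, its column names, n_clusters=2, and the column name "cluster" are all made up for illustration and are not from the question. It also swaps in select_dtypes for the private _get_numeric_data and fit_predict for the separate fit and predict calls.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data standing in for the asker's "data" dataframe.
data = pd.DataFrame({
    "x": [1.0, 1.1, 0.9, 8.0, 8.2, 7.9],
    "y": [1.0, 0.9, 1.2, 8.1, 7.9, 8.0],
    "label": ["a", "b", "c", "d", "e", "f"],  # non-numeric, left out of the clustering
})

# Keep numeric columns only and drop any column containing NaN.
# Rows are untouched, so new has the same index as data.
new = data.select_dtypes(include="number").dropna(axis=1)

# n_init is set explicitly because its default differs between scikit-learn versions.
km = KMeans(n_clusters=2, random_state=1, n_init=10)

# fit_predict fits the model and returns one label per row of new.
# Wrapping the labels in a Series keyed by new.index keeps each label on its own row.
data["cluster"] = pd.Series(km.fit_predict(new), index=new.index)

print(data)
```

Because only columns are dropped, the labels line up with data row for row. If rows had been dropped instead (dropna(axis=0)), assigning the Series by index would leave NaN in "cluster" for the rows that were not clustered.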
