Color scatter plot points based on a value in third column?

Gary · Mar 8, 2017

I am currently plotting a scatter plot from two columns of data. However, I would like to color the data points based on a class label stored in a third column.

The labels in my third column are 1, 2, or 3. How would I color the scatter plot points based on the values in this third column?

import matplotlib.pyplot as plt

plt.scatter(waterUsage['duration'], waterUsage['water_amount'])
plt.xlabel('Duration (seconds)')
plt.ylabel('Water (gallons)')

Answer

DYZ · Mar 8, 2017

The scatter function accepts a sequence of numbers for its c argument and maps each value to a color. You can also choose a colormap with cmap, if you want (but you don't have to):

plt.scatter(waterUsage['duration'], waterUsage['water_amount'],
            c=waterUsage['third_column'], cmap=plt.cm.autumn)
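
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch. The DataFrame contents are made up, and 'label' is a hypothetical name standing in for the third column. The legend step goes beyond what the answer describes; it assumes matplotlib 3.1 or newer, where the object returned by scatter has a legend_elements() method.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the asker's data; 'label' is a hypothetical column name.
waterUsage = pd.DataFrame({
    'duration': [30, 45, 60, 120, 150, 200],
    'water_amount': [1.0, 1.5, 2.2, 4.0, 5.1, 7.3],
    'label': [1, 1, 2, 2, 3, 3],
})

fig, ax = plt.subplots()

# c takes one number per point; cmap decides which color each number maps to.
points = ax.scatter(waterUsage['duration'], waterUsage['water_amount'],
                    c=waterUsage['label'], cmap=plt.cm.autumn)
ax.set_xlabel('Duration (seconds)')
ax.set_ylabel('Water (gallons)')

# Build one legend entry per distinct class value passed to c.
handles, labels = points.legend_elements()
ax.legend(handles, labels, title='Class')
```

Call plt.show() or fig.savefig(...) afterwards to see the result. With only three classes, a legend is usually easier to read than a colorbar, but either should work.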