Is it possible to plot multiple groups of data in a pandas.tools.plotting.scatter_matrix and assign a color to each group?
I'd like to show the scatter plots with data points for one group of data, let's say, in green and the other group in red in the very same scatter matrix. The same should apply for the density plots on the diagonal.
I know that this is possible with matplotlib's scatter function, but that does not give me a scatter matrix.
The documentation of pandas is mum on that.
The short answer is to determine the color of each dot in the scatter plot, roll those colors into an array with one entry per row of the data, and pass it as the color argument.
Example:
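import pandas as pd
# Older pandas exposed this as pandas.tools.plotting; in pandas 0.20 and later it lives in pandas.plotting
from pandas.plotting import scatter_matrix
from sklearn import datasets

# Load the iris data into a DataFrame and attach the class label (0, 1 or 2)
iris = datasets.load_iris()
iris_data = pd.DataFrame(data=iris['data'], columns=iris['feature_names'])
iris_data["target"] = iris['target']

# One color per class label
color_wheel = {0: "#0392cf",
               1: "#7bc043",
               2: "#ee4035"}
# Map each row's label to its color, giving one color per data point
colors = iris_data["target"].map(color_wheel)
ax = scatter_matrix(iris_data, color=colors, alpha=0.6, figsize=(15, 15), diagonal='hist')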
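For the two-group green/red case from the question, the same idea can be sketched without scikit-learn. This is only a minimal illustration under a few assumptions: the DataFrame, its column names (`x`, `y`, `group`) and the random data are made up for the example, and the import assumes a pandas version where the function is available as `pandas.plotting.scatter_matrix`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data: two groups of 50 points each, labelled "a" and "b"
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)]),
    "y": np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)]),
    "group": ["a"] * 50 + ["b"] * 50,
})

# One color per row, in the same order as the rows of df
palette = {"a": "green", "b": "red"}
colors = df["group"].map(palette).tolist()

# Plot only the numeric columns so the label column does not get its own panel
axes = scatter_matrix(df[["x", "y"]], color=colors, alpha=0.6,
                      figsize=(6, 6), diagonal="hist")
```

The color list has to line up row for row with the DataFrame that is passed in, so build it from the same frame and do not reorder or filter one without the other.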