I am new to matplotlib, and I want to create a plot, with the following information:
So basically, it's somewhat like a continuous box plot.
Thanks!
Using just scipy and matplotlib (you tagged only those libraries in your question) is a little bit verbose, but here's how you would do it (I'm doing it only for the quantiles):
import numpy as np
from scipy.stats import mstats
import matplotlib.pyplot as plt
# Create 10 columns with 100 rows of random data
rd = np.random.randn(100, 10)
# Calculate the quantiles column wise
quantiles = mstats.mquantiles(rd, axis=0)
# Plot it
labels = ['25%', '50%', '75%']
for i, q in enumerate(quantiles):
    plt.plot(q, label=labels[i])
plt.legend()
plt.show()
Which gives you:
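The three labels line up with the output because mquantiles computes the 25%, 50% and 75% quantiles when no prob argument is given. As a minimal sketch of the same call with the probabilities spelled out (the seeded generator and the derived labels list are my own additions for illustration, not part of the snippet above):

```python
import numpy as np
from scipy.stats import mstats

# Seeded generator so the example is reproducible (an assumption, not in the original)
rng = np.random.default_rng(0)
rd = rng.standard_normal((100, 10))

# Pass the probabilities explicitly instead of relying on the default
probs = [0.25, 0.5, 0.75]
quantiles = mstats.mquantiles(rd, prob=probs, axis=0)

# Build the legend labels from the same list so they cannot drift apart
labels = [f"{int(p * 100)}%" for p in probs]

# One row per probability, one column per data column
print(quantiles.shape)
```

Changing probs (for example to add 0.05 and 0.95) then changes both the curves and their labels together.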
Now, I would try to convince you to try the Pandas library :)
import numpy as np
import pandas as pd
# Create random data
rd = pd.DataFrame(np.random.randn(100, 10))
# Calculate all the desired values
df = pd.DataFrame({'mean': rd.mean(), 'median': rd.median(),
                   '25%': rd.quantile(0.25), '50%': rd.quantile(0.5),
                   '75%': rd.quantile(0.75)})
# And plot it
df.plot()
You'll get:
Or you can get all the stats in just one line:
rd.describe().T.drop('count', axis=1).plot()
Note: I dropped the count since it's not a part of the "5 number summary".
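If you want something that looks closer to the "continuous box plot" from the question, one option is to shade the band between the 25% and 75% quantiles and draw the median and mean on top. This is only a rough sketch: the helper name plot_quantile_band is hypothetical, and it uses np.percentile, whose default interpolation can give slightly different values than mquantiles.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_quantile_band(data, ax=None):
    """Hypothetical helper: shade the interquartile range and draw median and mean."""
    if ax is None:
        ax = plt.gca()
    # Column-wise quartiles; the result has one row per requested percentile
    q25, q50, q75 = np.percentile(data, [25, 50, 75], axis=0)
    x = np.arange(data.shape[1])
    # The shaded band plays the role of the "box"
    ax.fill_between(x, q25, q75, alpha=0.3, label="25%-75%")
    ax.plot(x, q50, label="median")
    ax.plot(x, data.mean(axis=0), linestyle="--", label="mean")
    ax.legend()
    return ax


# Same shape of random data as in the examples above (seed is an assumption)
rng = np.random.default_rng(0)
rd = rng.standard_normal((100, 10))

fig, ax = plt.subplots()
plot_quantile_band(rd, ax=ax)
```

Call plt.show() afterwards if you are not in an interactive session. A second fill_between between the column minimum and maximum would add the "whiskers" in the same way.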