I am trying to plot a histogram using the matplotlib.hist()
function but I am not sure how to do it.
I have a list
probability = [0.3602150537634409, 0.42028985507246375,
0.373117033603708, 0.36813186813186816, 0.32517482517482516,
0.4175257731958763, 0.41025641025641024, 0.39408866995073893,
0.4143222506393862, 0.34, 0.391025641025641, 0.3130841121495327,
0.35398230088495575]
and a list of names(strings).
How do I make the probability as my y-value of each bar and names as x-values?
If you want a histogram, you don't need to attach any 'names' to x-values, as on x-axis you would have data bins:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
np.random.seed(42)
x = np.random.normal(size=1000)
plt.hist(x, density=True, bins=30) # `density=False` would make counts
plt.ylabel('Probability')
plt.xlabel('Data');
You can make your histogram a bit fancier with PDF
line, titles, and legend:
import scipy.stats as st
plt.hist(x, density=True, bins=30, label="Data")
mn, mx = plt.xlim()
plt.xlim(mn, mx)
kde_xs = np.linspace(mn, mx, 301)
kde = st.gaussian_kde(x)
plt.plot(kde_xs, kde.pdf(kde_xs), label="PDF")
plt.legend(loc="upper left")
plt.ylabel('Probability')
plt.xlabel('Data')
plt.title("Histogram");
However, if you have limited number of data points, like in OP, a bar plot would make more sense to represent your data (then you may attach labels to x-axis):
x = np.arange(3)
plt.bar(x, height=[1,2,3])
plt.xticks(x, ['a','b','c'])