How to plot a time series array with confidence intervals in Python?

Ștefan · May 3, 2018 · Viewed 16.1k times

I have some time series that increase slowly overall but fluctuate a lot over short periods of time. For example, a time series could look like:

[10 + np.random.rand() for i in range(100)] + [12 + np.random.rand() for i in range(100)] + [14 + np.random.rand() for i in range(100)] 

I would like to plot the time series with a focus on the general trend, not on the small waves. Is there a way to plot the mean over a period of time, surrounded by a stripe that indicates the size of the waves (the stripe should represent the confidence interval, i.e. the range in which a data point could lie at that moment)?

A simple plot would look like this:

[image: line plot of the raw time series]

The plot I would like, with confidence intervals, would look like this:

[image: trend line surrounded by a shaded confidence band]

Is there an elegant way to do it in Python?

Answer

Ștefan · May 4, 2018

You could use the pandas method rolling(n) to compute the mean and standard deviation over each window of n consecutive points.
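As a small illustration of what rolling(n) returns, here is a sketch on a toy series (the values are made up for this example). Note that with the default settings the first n - 1 results are NaN, because those windows are not yet full; that is why the plotted curve further down starts a few points in.

```python
import pandas as pd

# Toy series (made-up values, for illustration only).
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Mean and (sample) standard deviation over windows of 3 consecutive points.
rolling_mean = s.rolling(3).mean()
rolling_std = s.rolling(3).std()

# The first two entries are NaN because their windows hold fewer than 3 points.
print(rolling_mean)
print(rolling_std)
```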

To shade the confidence band (here, the region within two rolling standard deviations of the rolling mean), you can use the function fill_between() from matplotlib.pyplot. For more information you could take a look over here; the following code is inspired by that example.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

#Declare the array containing the series you want to plot.
#For example, a noisy sine wave with 400 points:
time_series_array = np.sin(np.linspace(-np.pi, np.pi, 400)) + np.random.rand(400)
n_steps           = 15 #window size (number of points) for the rolling mean/std.

#Compute curves of interest:
time_series_df = pd.DataFrame(time_series_array)
smooth_path    = time_series_df.rolling(n_steps).mean()
path_deviation = 2 * time_series_df.rolling(n_steps).std()

under_line     = (smooth_path-path_deviation)[0]
over_line      = (smooth_path+path_deviation)[0]

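#Plotting:
plt.plot(smooth_path, linewidth=2) #rolling mean curve.
plt.fill_between(path_deviation.index, under_line, over_line, color='b', alpha=.1) #band of +/- 2 rolling std around the mean.
plt.show() #display the figure (needed when running as a script).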

With the above code you obtain something like this:

[image: smoothed curve with a shaded band around it]
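The same idea can be applied to the step-like series from the question. The sketch below is one possible variation, not part of the original answer: the helper name rolling_band is hypothetical, and the choices center=True (so the smoothed curve does not lag behind the data) and min_periods=1 (so the edges are not NaN) are assumptions made for this example. The shaded region is the rolling mean plus or minus two rolling standard deviations, which is a rough indication of spread rather than a formal confidence interval.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def rolling_band(values, window=15, n_std=2.0):
    """Return (mean, lower, upper) rolling curves as pandas Series.

    Hypothetical helper written for this example.
    """
    series = pd.Series(values)
    # center=True aligns each window on its midpoint; min_periods=1 lets
    # partially filled windows at the edges still produce a value.
    roll = series.rolling(window, center=True, min_periods=1)
    mean = roll.mean()
    # The std of a single point is NaN; treat it as zero spread.
    std = roll.std().fillna(0.0)
    return mean, mean - n_std * std, mean + n_std * std


# Series shaped like the one in the question: three noisy plateaus.
rng = np.random.default_rng(0)
values = np.concatenate([level + rng.random(100) for level in (10, 12, 14)])

mean, lower, upper = rolling_band(values)

fig, ax = plt.subplots()
ax.plot(values, color="gray", linewidth=0.5, label="raw data")
ax.plot(mean, linewidth=2, label="rolling mean")
ax.fill_between(mean.index, lower, upper, alpha=0.2, label="mean ± 2 std")
ax.legend()
plt.show()
```

A larger window gives a smoother trend line but blurs the jumps between plateaus, so the window size is worth tuning to the time scale you care about.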