I am trying to play with some online data, and I am having trouble plotting it because of an AttributeError raised in the plot function.
# Reading data from an online data set
import pandas as pd
import matplotlib.pyplot as plt
import requests, zipfile, StringIO
r = requests.get('https://archive.ics.uci.edu/ml/machine-learning-databases/00287/Activity Recognition from Single Chest-Mounted Accelerometer.zip')
z = zipfile.ZipFile(StringIO.StringIO(r.content))
activity_files = [name for name in z.namelist() if name.endswith('.csv')]
# Loading it to a pandas dataframe
z_data = z.read(activity_files[4]).split('\n')
activity_data = pd.DataFrame([row.split(',') for row in z_data], columns=('Seq','Ax','Ay','Az','Label'))
# Filtering
working_desk_data = activity_data[activity_data.Label == '1']
standing_data = activity_data[activity_data.Label == '3']
walking_data = activity_data[activity_data.Label == '4']
# Plotting
plt.plot(walking_data['Seq'], walking_data['Ax']) # <--- Error
plt.plot(walking_data['Seq'], walking_data['Ay']) # <--- Error
plt.plot(walking_data['Seq'], walking_data['Az']) # <--- Error
plt.show()
Any workarounds, or pointers in the right direction, would be helpful. I can plot the following, so I am clearly misunderstanding something above.
plt.plot(range(1,5), [1,2,1,2])
plt.show()
Edit: (Added data for Julien Spronck)
walking_data.head()
Out[12]:
Seq Ax Ay Az Label
22950 22950 1978 2386 1988 4
22951 22951 1977 2387 1990 4
22952 22952 1983 2390 1994 4
22953 22953 1978 2396 1994 4
22954 22954 1980 2387 1992 4
walking_data.columns
Out[79]:
Index([u'Seq', u'Ax', u'Ay', u'Az', u'Label'], dtype='object')
In [80]:
type(walking_data.Seq)
Out[80]:
pandas.core.series.Series
In [81]:
type(walking_data.Ax)
Out[81]:
pandas.core.series.Series
plot is getting confused because you're passing it strings, not numbers. If you convert them to (say) floats:
walking_data = walking_data.astype(float)
Then the three plt.plot calls should work and you'll get the plot you were after.
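To see why the conversion matters, here is a minimal sketch that rebuilds a few rows by hand the same way the question does. The rows are copied from the walking_data.head() output above; the names raw, as_strings, as_floats and parsed are made up for this illustration, and the snippet is written for Python 3 (io.StringIO rather than the StringIO module). It also shows one possible alternative, letting pd.read_csv do the parsing so the columns come out numeric from the start. Treat it as a sketch, not as the only way to do it.

```python
import io
import pandas as pd

# A few rows copied from the walking_data.head() output in the question.
raw = ("22950,1978,2386,1988,4\n"
       "22951,1977,2387,1990,4\n"
       "22952,1983,2390,1994,4")
columns = ['Seq', 'Ax', 'Ay', 'Az', 'Label']

# Splitting each line by hand, as in the question, leaves every cell a string,
# so none of the columns has a numeric dtype.
as_strings = pd.DataFrame([line.split(',') for line in raw.split('\n')],
                          columns=columns)
print(as_strings.dtypes)

# Converting the whole frame gives numeric columns that can be plotted.
as_floats = as_strings.astype(float)
print(as_floats.dtypes)

# Alternative (an assumption about what fits your workflow): let pandas parse
# the text itself, so the numbers are never stored as strings.
parsed = pd.read_csv(io.StringIO(raw), header=None, names=columns)
print(parsed.dtypes)
```

If only some columns need converting, astype can also be applied to a selection, e.g. df[['Ax', 'Ay', 'Az']].astype(float).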