I am trying to load a .arff file into a numpy array using the liac-arff library (https://github.com/renatopp/liac-arff).
This is my code.
import arff, numpy as np
dataset = arff.load(open('mydataset.arff', 'rb'))
data = np.array(dataset.data)
When I run it, I get this error:
ArffLoader.py", line 8, in <module>
data = np.array(dataset.data)
AttributeError: 'dict' object has no attribute 'data'
I have seen similar threads, such as "Smartsheet Data Tracker: AttributeError: 'dict' object has no attribute 'append'", but I am new to Python and have not been able to resolve this issue. How can I fix this?
dataset is a dict. For a dict, you access the values using the Python indexing notation, dataset[key], where key could be a string, integer, float, tuple, or any other immutable data type (it is a bit more complicated than that; more below if you are interested).
In your case, the key is in the form of a string. To access it, you need to give the string you want as an index, like so:
import arff
import numpy as np

# Open in text mode: on Python 3, liac-arff expects str lines rather than bytes.
with open('mydataset.arff') as f:
    dataset = arff.load(f)
data = np.array(dataset['data'])
(You also shouldn't put the imports on the same line, although this is just a readability issue.)
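If you want to see the difference between attribute access and index access without needing the .arff file, here is a minimal sketch. The dict below is a hand-written stand-in for what arff.load returns; its values are invented for illustration, and your real file will have different contents.

```python
# Hypothetical stand-in for the dict returned by arff.load; the values
# here are made up and only mimic the general shape of the result.
dataset = {
    'relation': 'weather',
    'attributes': [('temperature', 'REAL'), ('humidity', 'REAL')],
    'data': [[85.0, 85.0], [80.0, 90.0]],
}

# Listing the keys shows which names can be used as an index.
keys = sorted(dataset.keys())

# Attribute access fails on a dict; this is the error from the question.
try:
    dataset.data
    attribute_access_failed = False
except AttributeError:
    attribute_access_failed = True

# Index access with the string key works.
rows = dataset['data']

print(keys)
print(attribute_access_failed)
print(rows)
```

Printing the keys of your own dataset is a quick way to check which names you can index with.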
dataset is a dict, which in some languages is called a map or hashtable. In a dict, you access values in a similar way to how you index into a list or array, except the "index" can be any data type that is "hashable" (that is, it has a hash value, which is ideally a unique identifier for each possible value). This "index" is called a "key". In practice, at least for built-in types and most major packages, only immutable data types are hashable, but there is no actual rule that requires this to be the case.
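As a rough illustration of what "hashable" means in practice, the sketch below uses several built-in types as keys, shows that a list is rejected, and shows that a mutable user-defined class can still be used as a key. The Point class and the lookup dict are hypothetical names made up for this example.

```python
# Strings, integers, floats, and tuples are all hashable built-in types.
lookup = {
    'name': 'string key',
    42: 'integer key',
    3.5: 'float key',
    (1, 2): 'tuple key',
}

# A list is mutable and not hashable, so it cannot be used as a key.
try:
    lookup[[1, 2]] = 'list key'
    list_key_rejected = False
except TypeError:
    list_key_rejected = True


class Point:
    """A mutable user-defined class; by default it is hashed by identity."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Instances are hashable even though they can be modified afterwards.
p = Point(1, 2)
lookup[p] = 'mutable object key'
p.x = 10  # changing an attribute does not change the default hash
still_found = lookup[p]

print(list_key_rejected)
print(still_found)
```

The last part is why immutability is a convention rather than a rule: Python only requires that the key's hash stays the same while it is in the dict.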
Do you come from MATLAB? If so, then you are probably trying to use MATLAB's struct access technique. You could think of a dict as a much faster, more flexible struct, but the syntax for accessing values is different.
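If you really prefer struct-style dot access, one option is to wrap the dict in types.SimpleNamespace from the standard library (Python 3.3+). This is only a sketch with a made-up dict; it assumes all the keys are valid Python identifiers, and it is not something liac-arff does for you.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a loaded dataset; the values are invented.
dataset = {'relation': 'weather', 'data': [[85.0, 85.0], [80.0, 90.0]]}

# Normal dict access uses the key as an index.
via_index = dataset['data']

# Wrapping the dict gives MATLAB-struct-like attribute access.
as_struct = SimpleNamespace(**dataset)
via_attribute = as_struct.data

# Both names refer to the same underlying list.
print(via_index is via_attribute)
```

For most code, plain dataset['data'] is the more common and simpler choice.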