I have a dataframe extracted from Kaggle's San Francisco Salaries dataset (https://www.kaggle.com/kaggle/sf-salaries) and I wish to create a set of the values of a column, for instance 'Status'.
This is what I have tried, but it returns a list of all the records instead of the set (sf is the name of my data frame).
a=set(sf['Status'])
print a
According to the question "How to construct a set out of list items in python?", this should work.
If you only need the unique values, you can just use the unique method of the Series.
If you want a Python set, then do set(some_series).
In [1]: s = pd.Series([1, 2, 3, 1, 1, 4])
In [2]: s.unique()
Out[2]: array([1, 2, 3, 4])
In [3]: set(s)
Out[3]: {1, 2, 3, 4}
However, if you have a DataFrame, first select the Series out of it (some_data_frame['<col_name>']) and then apply either approach to that Series.
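Applied to the data frame from the question, the two approaches might look like the sketch below. The sample rows are a made-up stand-in for the Kaggle data, so the actual 'Status' values may differ. The dropna() call is an addition not mentioned above: missing values may otherwise end up in the set.

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle data frame named sf in the question;
# the real 'Status' column may contain different values.
sf = pd.DataFrame({"Status": ["FT", "PT", "FT", None, "PT"]})

# Unique values as a NumPy array (missing values are kept as one entry).
unique_values = sf["Status"].unique()

# Python set of the column, with missing values dropped first.
status_set = set(sf["Status"].dropna())

print(unique_values)
print(status_set)
```

If the missing values matter for your analysis, leave out dropna() and handle them explicitly instead.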