Pandas: Creating DataFrame from Series

BMichell picture BMichell · May 7, 2014 · Viewed 103.2k times · Source

My current code is shown below - I'm importing a MAT file and trying to create a DataFrame from variables within it:

mat = loadmat(file_path)  # load mat-file
Variables = mat.keys()    # identify variable names

df = pd.DataFrame         # Initialise DataFrame

for name in Variables:

    B = mat[name]
    s = pd.Series (B[:,1])

So within the loop I can create a series of each variable (they're arrays with two columns - so the values I need are in column 2)

My question is how do I append the series to the dataframe? I've looked through the documentation and none of the examples seem to fit what I'm trying to do.

Best Regards,

Ben

Answer

Jaan picture Jaan · Jun 28, 2015

Here is how to create a DataFrame where each series is a row.

For a single Series (resulting in a single-row DataFrame):

series = pd.Series([1,2], index=['a','b'])
df = pd.DataFrame([series])

For multiple series with identical indices:

cols = ['a','b']
list_of_series = [pd.Series([1,2],index=cols), pd.Series([3,4],index=cols)]
df = pd.DataFrame(list_of_series, columns=cols)

For multiple series with possibly different indices:

list_of_series = [pd.Series([1,2],index=['a','b']), pd.Series([3,4],index=['a','c'])]
df = pd.concat(list_of_series, axis=1).transpose()

To create a DataFrame where each series is a column, see the answers by others. Alternatively, one can create a DataFrame where each series is a row, as above, and then use df.transpose(). However, the latter approach is inefficient if the columns have different data types.