What does this mean? xarray error: cannot handle a non-unique multi-index

Y. Peng · Jan 3, 2019 · Viewed 15k times

I am trying to convert a dataframe to xarray. The head is like this:

                      z  Class       DA          x        y
iline xline idz
2     651   289  1455.0    2.0  0.62239  2345322.0  76720.0
            290  1460.0    0.0  0.46037  2345322.0  76720.0
            291  1465.0    4.0  0.41280  2345322.0  76720.0
            292  1470.0    0.0  0.39540  2345322.0  76720.0
            293  1475.0    2.0  0.61809  2345322.0  76720.0

When I use xr.Dataset.from_dataframe or df.to_xarray, I get the following error message:

cannot handle a non-unique multi-index!

Does anybody know what is going on here?

Answer

shoyer · Jan 3, 2019

The multi-index of your data frame has duplicate entries, which xarray cannot unstack into a multi-dimensional array -- each (iline, xline, idz) position in the resulting arrays can hold only one value, and a duplicated index entry would supply more than one.
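To see which rows are causing the problem, you can ask pandas directly. The snippet below is only a sketch on a made-up toy frame: the index level names mirror the question, but the values (including the deliberately repeated entry) are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; level names follow the question, values are made up.
index = pd.MultiIndex.from_tuples(
    [(2, 651, 289), (2, 651, 290), (2, 651, 290), (2, 651, 291)],
    names=["iline", "xline", "idz"],
)
df = pd.DataFrame(
    {"z": [1455.0, 1460.0, 1460.0, 1465.0], "DA": [0.62, 0.46, 0.50, 0.41]},
    index=index,
)

# False as soon as any index tuple appears more than once.
print(df.index.is_unique)

# keep=False flags every row in a duplicated group, not only the repeats,
# so you can inspect all conflicting rows side by side.
dupes = df[df.index.duplicated(keep=False)]
print(dupes)
```

Looking at `dupes` should tell you whether the repeated rows are exact copies (safe to drop) or carry different values (where an aggregation may be the better choice).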

You need to remove the duplicated entries in the index first, e.g., as described in Remove pandas rows with duplicate indices:

  • The simplest choice would be to drop duplicates, e.g., df[~df.index.duplicated()]
  • You might also use a groupby operation, e.g., to compute the mean: df.groupby(level=df.index.names).mean()

Once you've done this, you can safely convert the dataframe into xarray.
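Putting the two options together, a minimal sketch might look like the following. It assumes xarray is installed (pandas' `to_xarray` needs it) and again uses a made-up toy frame with one deliberately duplicated index entry.

```python
import pandas as pd

# Hypothetical toy data; level names follow the question, values are made up.
index = pd.MultiIndex.from_tuples(
    [(2, 651, 289), (2, 651, 290), (2, 651, 290), (2, 651, 291)],
    names=["iline", "xline", "idz"],
)
df = pd.DataFrame(
    {"z": [1455.0, 1460.0, 1460.0, 1465.0], "DA": [0.62, 0.46, 0.50, 0.41]},
    index=index,
)

# Option 1: keep only the first row for each duplicated index tuple.
first = df[~df.index.duplicated(keep="first")]

# Option 2: average all rows that share an index tuple.
averaged = df.groupby(level=df.index.names).mean()

# Either result now has a unique multi-index, so the conversion can proceed.
ds = averaged.to_xarray()
print(ds)
```

Which option fits depends on your data: dropping discards the later rows silently, while the mean blends them, so it is worth checking why the duplicates exist before picking one.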