I have the following data:
Invoice NoStockCode Description Quantity CustomerID Country
536365 85123A WHITE HANGING HEART T-LIGHT HOLDER 6 17850 United Kingdom
536365 71053 WHITE METAL LANTERN 6 17850 United Kingdom
536365 84406B CREAM CUPID HEARTS COAT HANGER 8 17850 United Kingdom
I am trying to do a groupby, so I have the following operation:
df.groupby(['InvoiceNo','CustomerID','Country'])['NoStockCode','Description','Quantity'].apply(list)
I want to get the output
|Invoice |CustomerID |Country |NoStockCode |Description |Quantity
|536365 |17850 |United Kingdom |85123A, 71053, 84406B |WHITE HANGING HEART T-LIGHT HOLDER, WHITE METAL LANTERN, CREAM CUPID HEARTS COAT HANGER |6, 6, 8
Instead I get:
|Invoice |CustomerID |Country |0
|536365 |17850 |United Kingdom |['NoStockCode','Description','Quantity']
I have tried agg and other methods, but I haven't been able to get all of the columns to join as a list. I don't need to use the list function, but in the end I want the different columns to be lists.
I can't reproduce your code right now, but I think that:
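print(df.groupby(['InvoiceNo','CustomerID','Country'],
                 as_index=False)[['NoStockCode','Description','Quantity']]
      .agg(list))
would give you the expected output. Note the double brackets: selecting several columns from a groupby needs a list of names, and recent pandas versions reject the tuple form ['NoStockCode','Description','Quantity'].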
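For reference, here is a minimal, self-contained sketch built from the three sample rows in the question. The column names are taken from the question's code (InvoiceNo, NoStockCode, and so on), and the variable names keys, values, as_lists and as_text are just for illustration. The original attempt most likely returned the column names because apply(list) receives each group as a DataFrame, and iterating a DataFrame yields its column labels rather than its rows; agg(list) works column by column instead.

```python
import pandas as pd

# Sample rows copied from the question; column names follow the question's code.
df = pd.DataFrame({
    "InvoiceNo": [536365, 536365, 536365],
    "NoStockCode": ["85123A", "71053", "84406B"],
    "Description": [
        "WHITE HANGING HEART T-LIGHT HOLDER",
        "WHITE METAL LANTERN",
        "CREAM CUPID HEARTS COAT HANGER",
    ],
    "Quantity": [6, 6, 8],
    "CustomerID": [17850, 17850, 17850],
    "Country": ["United Kingdom"] * 3,
})

keys = ["InvoiceNo", "CustomerID", "Country"]       # grouping columns
values = ["NoStockCode", "Description", "Quantity"]  # columns to collect

# Double brackets select a list of columns; agg(list) gathers each
# column's values within a group into a Python list.
as_lists = df.groupby(keys, as_index=False)[values].agg(list)

# Optional variant: comma-separated strings instead of lists,
# closer to the layout shown in the question's desired output.
as_text = df.groupby(keys, as_index=False)[values].agg(
    lambda s: ", ".join(s.astype(str))
)

print(as_lists)
print(as_text)
```

If you only need the values as text for display, the second form may be more convenient; if you want to keep working with the items, the list form is probably the better choice.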