Top "Pandas-groupby" questions

To be used for grouping variables together based on a given condition.

What's the equivalent of pandas' value_counts() in PySpark?

I have the following Python/pandas command: df.groupby('Column_Name').agg(lambda x: x.value_counts().max() where …

dataframe count pyspark pandas-groupby
Pandas GroupBy.apply method duplicates first group

My first SO question: I am confused about this behavior of apply method of groupby in pandas (0.12.0-4), it appears …

python pandas group-by pandas-groupby
Pandas Merge two rows into a single row based on columns

I have 2 rows that look like these,
------------------------------
DealName | Target | Acquirer |
------------------------------
ABC-XYZ  | ABC    | None     |
------------------------------
ABC-XYZ  | None   | XYZ      |
------------------------------
…

python python-2.7 pandas pandas-groupby pandasql
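
One possible way to collapse the two rows shown in the excerpt above, sketched under the assumption that the "None" cells are real missing values (Python None) rather than the string "None". It relies on GroupBy.first(), which takes the first non-null value in each column per group.

```python
import pandas as pd

# The two rows from the excerpt; None is assumed to mean "missing".
df = pd.DataFrame({
    "DealName": ["ABC-XYZ", "ABC-XYZ"],
    "Target": ["ABC", None],
    "Acquirer": [None, "XYZ"],
})

# first() picks the first non-null entry of every column within each group,
# so the complementary rows fold into a single row per DealName.
merged = df.groupby("DealName", as_index=False).first()
print(merged)
```

If the cells hold the literal string "None", they would need to be converted to missing values first (for example with replace) before this approach applies.
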
Python Pandas Sum Values in Columns If date between 2 dates

I have a dataframe df which can be created with this: data={'id':[1,1,1,1,2,2,2,2], 'date1':[datetime.date(2016,1,1),datetime.date(2016,1,2),datetime.…

python pandas dataframe pandas-groupby melt
Pandas groupby with categories with redundant nan

I am having issues using pandas groupby with categorical data. Theoretically, it should be super efficient: you are grouping and …

python pandas numpy group-by pandas-groupby
pandas get minimum of one column in group when groupby another

I have a pandas dataframe that looks like this:
   c  y
0  9  0
1  8  0
2  3  1
3  6  2
4  1  3
5  2  3
6  5  3
7  4  4
8  0  4
9  7  4
I'd like to groupby y and get the min …

python pandas pandas-groupby
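
A minimal sketch of one reading of the question above: the minimum of c within each y group, using the data from the excerpt. The excerpt is truncated, so whether the asker wanted only the minimum values or the full rows is an assumption; both variants are shown.

```python
import pandas as pd

# Data copied from the excerpt.
df = pd.DataFrame({
    "c": [9, 8, 3, 6, 1, 2, 5, 4, 0, 7],
    "y": [0, 0, 1, 2, 3, 3, 3, 4, 4, 4],
})

# Minimum of c per y group, returned as a Series indexed by y.
min_c = df.groupby("y")["c"].min()

# Full rows that hold each group's minimum: idxmin() returns the row
# label of the minimum in every group, and loc selects those rows.
min_rows = df.loc[df.groupby("y")["c"].idxmin()]

print(min_c)
print(min_rows)
```
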
Groupby class and count missing values in features

I have a problem and I cannot find any solution on the web or in the documentation, even if I think that …

python pandas dataframe group-by pandas-groupby
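
A rough sketch of counting missing values per feature within each class, as the title above describes. The excerpt gives no data, so the column names (cls, f1, f2) and values are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "cls" is the class label, f1 and f2 are features.
df = pd.DataFrame({
    "cls": ["a", "a", "b"],
    "f1": [1.0, np.nan, np.nan],
    "f2": [np.nan, np.nan, 3.0],
})

# isna() marks missing cells as True; grouping that boolean frame by the
# class column and summing counts the True values per class and feature.
missing = df.drop(columns="cls").isna().groupby(df["cls"]).sum()
print(missing)
```
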
Pandas: groupby column A and make lists of tuples from other columns?

I would like to aggregate user transactions into lists in pandas. I can't figure out how to make a list …

python pandas dataframe pandas-groupby
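
One way the aggregation in the question above might look, assuming a frame with a grouping column and two value columns. The names user, item and amount are made up for this sketch; the excerpt does not show the real columns.

```python
import pandas as pd

# Hypothetical transactions table.
df = pd.DataFrame({
    "user": ["a", "a", "b"],
    "item": ["x", "y", "z"],
    "amount": [1, 2, 3],
})

# Build one tuple per row from the value columns, then collect the
# tuples of each group into a list.
df["pair"] = list(zip(df["item"], df["amount"]))
pairs = df.groupby("user")["pair"].agg(list)
print(pairs)
```
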
What is the meaning of the error cannot handle a non-unique multi index in groupby clause?

I have a dataframe which has three levels of index, and I wish to calculate how much a value deviates …

python pandas unique pandas-groupby multi-index
How can we use pandas to generate min, max, mean, median, ...as new columns for the dataframe?

I just picked up pandas. I have a dataframe as follows:
    DEST  MONTH  PRICE   SOUR     TYPE  YEAR
0  DEST7      8    159  SOUR4  WEEKEND  2015
1  …

python mean min pandas-groupby autogeneratecolumn
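
A hedged sketch of adding per-group statistics as new columns with GroupBy.transform, which returns a result aligned to the original rows. Grouping by DEST and summarising PRICE is an assumption, since the excerpt is truncated; only the first row (DEST7, 159) comes from the question, and the other rows are invented for illustration.

```python
import pandas as pd

# First row taken from the excerpt; the remaining rows are made up.
df = pd.DataFrame({
    "DEST": ["DEST7", "DEST7", "DEST2"],
    "PRICE": [159, 101, 80],
})

# transform() broadcasts each group's statistic back to every row of
# that group, so the result can be assigned directly as a new column.
for stat in ["min", "max", "mean", "median"]:
    df[f"PRICE_{stat}"] = df.groupby("DEST")["PRICE"].transform(stat)

print(df)
```
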