python - Pandas DataFrames: How to groupby on a groupby?


I'm trying to generalize the question asked here.

The mlb dataframe looks like:

       player           position           salary   year
    0  mike witt        pitcher            1400000  1988
    1  george hendrick  outfielder         989333   1988
    2  chili davis      outfielder         950000   1988
    3  brian downing    designated hitter  900000   1988
    4  bob boone        catcher            883000   1988
    5  bob boone        catcher            883000   1989
    6  frank smith      catcher            993000   1988
    7  frank smith      pitcher            1300000  1989

Note that the same player may be listed multiple times for different years. I'm trying to find the player with the maximum total salary for each position. The output should look like:

       position           player           salary
    0  pitcher            mike witt        1400000
    1  outfielder         george hendrick  989333
    2  designated hitter  brian downing    900000
    3  catcher            bob boone        1766000

I think I need to group by position, then group by player, sum the salary for each player, and find the maximum. I'm having trouble doing this.

Once I have positions = mlb.groupby("position"), I'm having trouble with the next step. I think a nested groupby on player is necessary, but I don't know how to proceed.

This is messy, but it gets the job done.

    import pandas as pd

    df = pd.DataFrame({'player': ['mike witt', 'george hendrick', 'chili davis', 'brian downing', 'bob boone', 'bob boone'],
                       'position': ['pitcher', 'outfielder', 'outfielder', 'designated hitter', 'catcher', 'catcher'],
                       'salary': [1400000, 989333, 950000, 900000, 883000, 900000],
                       'year': [1988, 1988, 1988, 1988, 1988, 1988]})

    # total salary per (player, position) pair
    gp = df.groupby(['player', 'position'])['salary'].sum().to_frame().reset_index()
    # highest total first, then keep the first row seen for each position
    gp.sort_values('salary', ascending=False).drop_duplicates('position')

Or, picking the row with the highest total salary within each position:

    gp.loc[gp.groupby('position')['salary'].idxmax()]

As @dawg mentioned, this treats a player who has played multiple positions as a different player for each position, so salaries are totalled per position, as shown here.

                player           position   salary
    0        bob boone            catcher  1783000
    4        mike witt            pitcher  1400000
    3  george hendrick         outfielder   989333
    1    brian downing  designated hitter   900000
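For reference, here is a minimal sketch of the same two-step idea run on the question's full sample data, where frank smith appears at two positions. The names `mlb`, `totals` and `top` are only illustrative, and using `idxmax` for the second step is one possible choice rather than the only way to do it.

```python
import pandas as pd

# The sample rows from the question, including frank smith at two positions.
mlb = pd.DataFrame({
    'player': ['mike witt', 'george hendrick', 'chili davis', 'brian downing',
               'bob boone', 'bob boone', 'frank smith', 'frank smith'],
    'position': ['pitcher', 'outfielder', 'outfielder', 'designated hitter',
                 'catcher', 'catcher', 'catcher', 'pitcher'],
    'salary': [1400000, 989333, 950000, 900000, 883000, 883000, 993000, 1300000],
    'year': [1988, 1988, 1988, 1988, 1988, 1989, 1988, 1989],
})

# Step 1: total salary for each (position, player) pair.
totals = mlb.groupby(['position', 'player'])['salary'].sum().reset_index()

# Step 2: within each position, keep the row whose total is largest.
top = totals.loc[totals.groupby('position')['salary'].idxmax()].reset_index(drop=True)
print(top)
```

Because the totals are computed per (position, player) pair, frank smith's catcher and pitcher salaries are compared separately within each position and are never added together.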
