Tuesday, July 18, 2017

Sort by column within multi index level in pandas

Leave a Comment

I have a sorting request per example below.

Do i need to reset_index(), then sort() and then set_index() or is there a slick way to do this?

l = [[1,'A',99],[1,'B',102],[1,'C',105],[1,'D',97],[2,'A',19],[2,'B',14],[2,'C',10],[2,'D',17]] df = pd.DataFrame(l,columns = ['idx1','idx2','col1']) df.set_index(['idx1','idx2'],inplace=True)  # assume data has been received like this... print df             col1 idx1 idx2       1    A       99      B      102      C      105      D       97 2    A       19      B       14      C       10      D       17  # I'd like to sort descending on col1, partitioning within index level = 'idx2'             col1 idx1 idx2       1    C      105      B      102      A       99      D       97  2    A       19      D       17      B       14      C       10 

Thank you for the answer Note I change the data slightly:

l = [[1,'A',99],[1,'B',11],[1,'C',105],[1,'D',97],[2,'A',19],[2,'B',14],[2,'C',10],[2,'D',17]] df = pd.DataFrame(l,columns = ['idx1','idx2','col1']) df.set_index(['idx1','idx2'],inplace=True) df = df.sort_index(by='col1', ascending=False) 

however the output is

idx1 idx2       1    C      105      A       99      D       97 2    A       19      D       17      B       14 1    B       11 2    C       10 

i would have wanted it to be

idx1 idx2       1    C      105      A       99      D       97      B       11  2    A       19      D       17      B       14      C       10 

3 Answers

Answers 1

you can use sort_index:

 df.sort_index(by='col1', ascending=False) 

This outputs:

             col1 idx1    idx2     1       C    105         B    102         A    99         D    97 2       A    19         D    17         B    14         C    10 

Answers 2

You need DataFrame.reset_index, DataFrame.sort_values and DataFrame.set_index::

l = [[1,'A',99],[1,'B',11],[1,'C',105],[1,'D',97],      [2,'A',19],[2,'B',14],[2,'C',10],[2,'D',17]] df = pd.DataFrame(l,columns = ['idx1','idx2','col1']) df.set_index(['idx1','idx2'],inplace=True) print (df)            col1 idx1 idx2       1    A       99      B       11      C      105      D       97 2    A       19      B       14      C       10      D       17  df = df.reset_index() \        .sort_values(['idx1','col1'], ascending=[True,False]) \        .set_index(['idx1','idx2']) print (df)            col1 idx1 idx2       1    C      105      A       99      D       97      B       11 2    A       19      D       17      B       14      C       10 

Answers 3

This first sorts by the desired column, the resorts on the idx1 MultiIndex level only and works in up to date pandas versions that deprecate the by kwarg.

df.sort_values('col1', ascending=False).sort_index(level='idx1', sort_remaining=False) 

Output:

             col1 idx1    idx2     1       C    105         B    102         A    99         D    97 2       A    19         D    17         B    14         C    10 
If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment