Python Tutorial: Pandas DataFrame-Sorting

Tuesday, 22 September 2020


How to Sort a Pandas DataFrame (with examples)

You can use df.sort_values to sort a Pandas DataFrame by the values in one or more columns.

In this short tutorial, you’ll see 4 examples of sorting:

  1. By a single column in ascending order
  2. By a single column in descending order
  3. By multiple columns – Case 1 (Year, then Price)
  4. By multiple columns – Case 2 (Year, then Brand)

Example 1: Sort a Pandas DataFrame in ascending order

# sort - ascending order
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }

df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# sort Brand - ascending order
df.sort_values(by=['Brand'], inplace=True)

print(df)
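Note that inplace=True modifies df itself, and the call then returns None. As a minimal sketch of the alternative, assuming you would rather leave the original DataFrame untouched, you can drop inplace=True and assign the returned copy to a new name (sorted_df is a name chosen here purely for illustration):

```python
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]}
df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# Without inplace=True, sort_values returns a new, sorted DataFrame
# and leaves df in its original row order.
sorted_df = df.sort_values(by=['Brand'])

print(sorted_df['Brand'].tolist())  # sorted copy: Audi A4 comes first
print(df['Brand'].tolist())         # original: Honda Civic still comes first
```

Which style to use is largely a matter of taste; the returned-copy form tends to be easier to chain with other DataFrame methods.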


Example 2: Sort a Pandas DataFrame in descending order

Alternatively, you can sort the Brand column in descending order. To do that, pass ascending=False:

df.sort_values(by=['Brand'], inplace=True, ascending=False)

# sort - descending order
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }

df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# sort Brand - descending order
df.sort_values(by=['Brand'], inplace=True, ascending=False)

print(df)

Output: the rows are printed in the order Toyota Corolla, Honda Civic, Ford Focus, Audi A4.
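After sorting, each row keeps its original index label, so the index no longer runs 0, 1, 2, 3. The following is a rough sketch of how to renumber it, assuming pandas 1.0 or later, where sort_values accepts the ignore_index parameter (the names desc and renumbered are chosen here for illustration):

```python
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]}
df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# Descending sort: rows move, but keep their original index labels.
desc = df.sort_values(by=['Brand'], ascending=False)
print(desc.index.tolist())        # [1, 0, 2, 3]

# ignore_index=True relabels the sorted rows 0..n-1
# (assumes pandas >= 1.0).
renumbered = df.sort_values(by=['Brand'], ascending=False, ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```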


Example 3: Sort by multiple columns – case 1

But what if you want to sort by multiple columns?

In that case, pass a list of column names to by, using the following template:

df.sort_values(by=['First Column', 'Second Column', ...], inplace=True)

Suppose that you want to sort by both ‘Year’ and ‘Price’. Two records share the Year 2018 (the Ford Focus and the Audi A4), so a second column, ‘Price’, is needed to decide their order:

df.sort_values(by=['Year','Price'], inplace=True)

Example:

# sort by multiple columns
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }

df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# sort by multiple columns: Year and Price
df.sort_values(by=['Year', 'Price'], inplace=True)

print(df)

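Output: the rows are printed in the order Toyota Corolla (2013), Honda Civic (2015), Ford Focus (2018, 27000), Audi A4 (2018, 35000).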
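When sorting by several columns, ascending can also be a list with one flag per column. As a small sketch (the variable name mixed is hypothetical), the code below sorts by Year ascending and, within the same Year, by Price descending:

```python
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]}
df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# One ascending flag per column in `by`:
# Year oldest-first, then Price highest-first within each Year.
mixed = df.sort_values(by=['Year', 'Price'], ascending=[True, False])

print(mixed['Brand'].tolist())
# ['Toyota Corolla', 'Honda Civic', 'Audi A4', 'Ford Focus']
```

The two 2018 cars now appear with the more expensive Audi A4 ahead of the Ford Focus, while the Year ordering is unchanged.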



Example 4: Sort by multiple columns – case 2

Finally, let’s sort by ‘Year’ first and then by ‘Brand’:

df.sort_values(by=['Year', 'Brand'], inplace=True)

Example:

# sort by multiple columns
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }

df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# sort by multiple columns: Year and Brand
df.sort_values(by=['Year', 'Brand'], inplace=True)

print(df)

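Output: the rows are printed in the order Toyota Corolla (2013), Honda Civic (2015), Audi A4 (2018), Ford Focus (2018).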
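The only difference between Case 1 and Case 2 is the tie-breaker for the two 2018 cars. As an illustrative sketch (by_price and by_brand are names invented here), the two orderings can be compared side by side:

```python
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]}
df = pd.DataFrame(cars, columns=['Brand', 'Price', 'Year'])

# Case 1: ties on Year are broken by Price (lowest first).
by_price = df.sort_values(by=['Year', 'Price'])['Brand'].tolist()

# Case 2: ties on Year are broken by Brand (alphabetical).
by_brand = df.sort_values(by=['Year', 'Brand'])['Brand'].tolist()

print(by_price)  # ['Toyota Corolla', 'Honda Civic', 'Ford Focus', 'Audi A4']
print(by_brand)  # ['Toyota Corolla', 'Honda Civic', 'Audi A4', 'Ford Focus']
```

The first two rows match in both cases because 2013 and 2015 are unique Years; only the 2018 rows swap places.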
