Python Tutorial: Pandas DataFrame-Delete Column(s) & Row(s)

Sunday, 18 June 2023

a)    Delete Columns

1.       Using the del keyword

2.      Using the pop() method

3.      Using the drop() method

b)    Delete Rows

1.       Using drop method

2.      Using .index/numeric property

c)    Delete rows and columns together

 

a)    Delete a column in pandas

Columns can be deleted from an existing DataFrame in three ways:

1.      Using the del keyword: You can delete a column with del df['column name']. The deletion happens in place, so the DataFrame itself is changed.

import pandas as pd
rec  = {"Name":['Amit','Vanshi','Jyoti','Sadhana','Shraddha'],
          "Age":[45,30,35,50,28],'Income':[340000,400000,300000,330000,230000]}
RepoDf = pd.DataFrame(rec)
print(RepoDf)
del RepoDf["Age"]
print(RepoDf)
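
One point the example above does not show is what happens when the column is not there. The following is a minimal sketch, assuming a small illustrative DataFrame; the column name 'Salary' is hypothetical and deliberately absent.

```python
import pandas as pd

# Illustrative data for this sketch (not the tutorial's full table).
df = pd.DataFrame({"Name": ["Amit", "Vanshi"], "Age": [45, 30]})

del df["Age"]                  # removes the column from df itself
remaining = list(df.columns)   # only "Name" is left
print(remaining)

# 'Salary' is a hypothetical column that does not exist in df.
# Deleting a missing column raises KeyError.
try:
    del df["Salary"]
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)
```

If a missing column is a normal case in your data, checking 'Salary' in df.columns before calling del avoids the exception.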


2.     Using the pop() method: pop() deletes a column from a DataFrame; pass the column name as the argument. It returns the deleted column as a Series and removes it from the DataFrame in place.

Example:

import pandas as pd
rec  = {"Name":['Amit','Vanshi','Jyoti','Sadhana','Shraddha'],
          "Age":[45,30,35,50,28],'Income':[340000,400000,300000,330000,230000]}
RepoDf = pd.DataFrame(rec)
print(RepoDf)
print("\nValues of the deleted column")
print(RepoDf.pop("Income"))
print("\nRest of the DataFrame after pop()")
print(RepoDf)
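
Because pop() hands the column back, it can be reused. A minimal sketch, assuming a shortened version of the data above, that moves a column to the end by popping it and assigning it again:

```python
import pandas as pd

# Shortened, illustrative version of the tutorial's data.
df = pd.DataFrame({"Name": ["Amit", "Vanshi", "Jyoti"],
                   "Age": [45, 30, 35],
                   "Income": [340000, 400000, 300000]})

age = df.pop("Age")             # a pandas Series; df loses the column
print(type(age).__name__)
print(list(df.columns))

# The popped Series keeps the row index, so assigning it back
# re-adds the column, now as the last one.
df["Age"] = age
print(list(df.columns))
```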


3.    Using the drop() method: drop() removes the rows or columns that have the given labels from a DataFrame.

a)    Delete Single Column

b)    Delete Multiple Columns

c)    Delete columns using columns parameter in drop() function

Syntax :

            drop(labels, axis=1, inplace=True)

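Where, labels : a column name, or a list of column names

            axis=1 means delete column(s) and axis=0 (the default) means delete row(s).

              If inplace=True, the DataFrame itself is modified and drop() returns None. With the default inplace=False, drop() returns a new DataFrame and leaves the original unchanged.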
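
The effect of the axis argument can be sketched as follows. The DataFrame and its row labels 'r1' and 'r2' are assumptions made for illustration only.

```python
import pandas as pd

# Illustrative frame; the row labels 'r1' and 'r2' are made up for this sketch.
df = pd.DataFrame({"Name": ["Raju", "Ashmita"], "Age": [48, 22]},
                  index=["r1", "r2"])

no_age = df.drop("Age", axis=1)                 # axis=1: label is a column name
no_age_alias = df.drop("Age", axis="columns")   # string alias for axis=1
no_r1 = df.drop("r1", axis=0)                   # axis=0 (default): label is a row label

print(no_age.equals(no_age_alias))
print(list(no_r1.index))
print(list(df.columns))   # df is unchanged because inplace was not set
```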

a)    Delete Single column

Example:

import pandas as pd
rec  = {"Name":['Raju','Ashmita','Rajan','Sadhana','Ram Ji'],
          "Age":[48,22,55,60,58],
        'Gender':['M','F','M','F','M'],
        'Income':[340000,400000,300000,330000,230000]}
RepoDf = pd.DataFrame(rec)
print(RepoDf)
print(RepoDf.drop('Gender',axis=1)) # prints a copy without 'Gender'; RepoDf is not modified
print(RepoDf) # the original DataFrame still has the 'Gender' column


Example with inplace=True:

import pandas as pd
rec  = {"Name":['Raju','Ashmita','Rajan','Sadhana','Ram Ji'],
          "Age":[48,22,55,60,58],
        'Gender':['M','F','M','F','M'],
        'Income':[340000,400000,300000,330000,230000]}
RepoDf = pd.DataFrame(rec)
print(RepoDf)
RepoDf.drop('Gender',axis=1,inplace=True)
print(RepoDf) 


b)    Delete Multiple columns

Syntax :

            drop([column1, column2, ...], axis=1, inplace=True)

Example:

import pandas as pd
rec  = {"Name":['Raju','Ashmita','Rajan','Sadhana','Ram Ji'],
          "Age":[48,22,55,60,58],
        'Gender':['M','F','M','F','M'],
        'Income':[340000,400000,300000,330000,230000]}
RepoDf = pd.DataFrame(rec)
print(RepoDf)
print("\nDelete Multiple columns")
RepoDf.drop(['Age','Income'],axis=1,inplace=True)
print(RepoDf)

c)    Delete columns using columns parameter in drop() function

Example:

import pandas as pd
rec  = {"Name":['Raju','Ashmita','Rajan','Sadhana','Ram Ji'],
          "Age":[48,22,55,60,58],
        'Gender':['M','F','M','F','M'],
        'Income':[340000,400000,300000,330000,230000]}
RepoDf = pd.DataFrame(rec)
print(RepoDf)
print("\nDelete Multiple columns")
RepoDf.drop(columns=['Age','Income'],inplace=True)  # axis is not needed when columns= is used
print(RepoDf)
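
As a quick check, the two spellings give the same result. This is a minimal sketch on a shortened, illustrative copy of the data; it also shows that mixing positional labels with the columns parameter is rejected.

```python
import pandas as pd

# Shortened, illustrative version of the tutorial's data.
df = pd.DataFrame({"Name": ["Raju", "Ashmita"],
                   "Age": [48, 22],
                   "Income": [340000, 400000]})

by_axis = df.drop(["Age", "Income"], axis=1)       # labels + axis
by_keyword = df.drop(columns=["Age", "Income"])    # columns keyword
same = by_axis.equals(by_keyword)
print(same)

# Passing positional labels together with columns= raises ValueError.
try:
    df.drop(["Age"], columns=["Income"])
    mixed_raised = False
except ValueError:
    mixed_raised = True
print(mixed_raised)
```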


b)    Delete rows in pandas

1.      Using the drop() method

pandas offers the drop() method to delete data from a DataFrame. You can use drop() in the following ways to delete row(s) and column(s).

a)    Delete rows by index name

You can delete a row by its index label. Observe the following code:

import pandas as pd
dt=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(dt, index=['Akshita','Bharati','Kavita','Anshu','Mayank'])
print(df)
print('Delete Kavita Record')
df=df.drop('Kavita')
print(df)

In the above code, drop() deletes the row whose index label is 'Kavita'. The axis is not specified, so the default is used. The default axis is 0, which refers to rows; axis=1 is used to delete columns.
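
When a label may not be present, drop() raises KeyError by default. A minimal sketch, assuming a shortened marks table; the label 'Rohan' is hypothetical and deliberately missing. The errors='ignore' option drops only the labels that exist.

```python
import pandas as pd

# Shortened, illustrative version of the marks table.
df = pd.DataFrame({"English": [74, 79, 48]},
                  index=["Akshita", "Bharati", "Kavita"])

# 'Rohan' is a hypothetical label that is not in the index.
try:
    df.drop("Rohan")
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)

# errors='ignore' skips missing labels and drops the ones that exist.
safe = df.drop(["Kavita", "Rohan"], errors="ignore")
print(list(safe.index))
```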


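b)    Delete rows using the index parameter

The following code passes the row label through the index parameter, which is equivalent to passing it as labels with axis=0.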

import pandas as pd
dt=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(dt, index=['Akshita','Bharati','Kavita','Anshu','Mayank'])
print(df)
print('Delete Anshu Record')
df=df.drop(index='Anshu')
print(df)


c)    Delete rows using multiple indexes

You can delete multiple rows by passing a list of index labels. Observe this code:

import pandas as pd
dt=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(dt, index=['Akshita','Bharati','Kavita','Anshu','Mayank'])
print(df)
print('\nDelete Akshita & Anshu Records')
df=df.drop(index=['Akshita','Anshu'])
print(df)


OR, pass the list of labels positionally (axis=0 is the default):

import pandas as pd
dt=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(dt, index=['Akshita','Bharati','Kavita','Anshu','Mayank'])
print(df)
print('\nDelete Akshita & Anshu Records')
df=df.drop(['Akshita','Anshu'])  # same result as index=['Akshita','Anshu']
print(df)

d)   Delete rows using a list of index labels along with inplace

Similarly, you can pass a single label or a list of labels positionally.

Whenever rows are deleted from a DataFrame, a new DataFrame is returned and the old DataFrame remains intact. To modify the original DataFrame instead, pass inplace=True to drop(). Observe the following code:

import pandas as pd
dt=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(dt, index=['Akshita','Bharati','Kavita','Anshu','Mayank'])
print(df)
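print('\nDelete Akshita & Anshu Records without using inplace=True')
df.drop(['Akshita','Anshu'])  # the result is discarded, so df is unchanged
print(df)
print('\nDelete Akshita & Anshu Records using inplace=True')
df.drop(['Akshita','Anshu'],inplace=True)  # df itself is modified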
print(df)
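
A common slip with inplace=True is assigning the result back to the variable. The sketch below, on a shortened illustrative table, shows that an in-place drop returns None, and contrasts it with the reassignment style used in the earlier examples.

```python
import pandas as pd

# Shortened, illustrative version of the marks table.
df = pd.DataFrame({"English": [74, 79, 48]},
                  index=["Akshita", "Bharati", "Kavita"])

result = df.drop(["Akshita"], inplace=True)
print(result)            # None: the change was made to df directly
print(list(df.index))

# Writing df = df.drop(..., inplace=True) would therefore replace df with None.
# Without inplace, assign the returned DataFrame instead:
df = df.drop("Bharati")
print(list(df.index))
```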

2.      Using the numeric index

When a DataFrame is created without an explicit index, its rows are labelled 0, 1, 2 and so on. You can delete row(s) by passing a list of these numeric labels to drop(). Observe this code:

import pandas as pd
dt=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(dt)
print(df)
df.drop([0,3],inplace=True)
print(df)
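
Note that drop() matches index labels, not positions, and the remaining rows keep their old labels. A minimal sketch, assuming a one-column illustrative table, of dropping by position through df.index and renumbering afterwards with reset_index():

```python
import pandas as pd

# One-column illustrative table with the default 0..4 index.
df = pd.DataFrame({"English": [74, 79, 48, 53, 68]})

df = df.drop([0, 3])                 # removes the rows labelled 0 and 3
print(list(df.index))                # the labels 1, 2 and 4 remain

# The first remaining row now has label 1, so translate a position
# into a label with df.index before dropping.
by_position = df.drop(df.index[[0]])
print(list(by_position.index))

# reset_index(drop=True) renumbers the rows from 0 again.
renumbered = by_position.reset_index(drop=True)
print(list(renumbered.index))
```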



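c)    Delete rows and columns together

drop() accepts the index and columns parameters in the same call, so rows and columns can be removed together: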

import pandas as pd
stud=({'Name':['Akshita','Bharati','Kavita','Anshu','Mayank'],
       'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(stud)
print(df)
print("\nDelete 1st, 3rd & 4th rows and English & IP columns")
df.drop(index=[0,2,3],columns=['English','IP'],inplace=True)
print(df)


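You can also pick the rows and columns by position through df.index and df.columns. Run on the original DataFrame (before the in-place drop above), the following removes the same rows and columns:

print(df.drop(index=df.index[[0, 2, 3]], columns=df.columns[[1, 5]]))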

 

Example:

import pandas as pd
stud=({'English':[74,79,48,53,68],
         'Accounts':[76,78,80,76,73],
         'Economics':[57,74,55,89,70],
         'B. Studies':[76,85,63,68,59],
         'IP':[82,93,69,98,79]})
df=pd.DataFrame(stud, index=['Akshita','Bharati','Kavita','Anshu','Mayank'])
print(df)
df.drop(index=['Akshita','Kavita','Anshu'],columns=['English','IP'],inplace=True)
print(df)
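
Dropping rows and columns is the mirror image of selecting what you want to keep. As a cross-check, the sketch below uses the same marks table and compares the drop() result with a .loc selection of the remaining labels; .loc is swapped in here only for comparison and is not part of the method described above.

```python
import pandas as pd

stud = {'English': [74, 79, 48, 53, 68],
        'Accounts': [76, 78, 80, 76, 73],
        'Economics': [57, 74, 55, 89, 70],
        'B. Studies': [76, 85, 63, 68, 59],
        'IP': [82, 93, 69, 98, 79]}
df = pd.DataFrame(stud, index=['Akshita', 'Bharati', 'Kavita', 'Anshu', 'Mayank'])

# Remove three rows and two columns in one call (returns a new DataFrame).
dropped = df.drop(index=['Akshita', 'Kavita', 'Anshu'], columns=['English', 'IP'])

# Select the rows and columns that should remain.
kept = df.loc[['Bharati', 'Mayank'], ['Accounts', 'Economics', 'B. Studies']]

print(dropped.equals(kept))
```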

