Python Tutorial: Pandas DataFrame-II Select or Access Data

Friday, 21 August 2020

Pandas DataFrame-II Select or Access Data

 Selecting or Accessing Data

You can access and retrieve records from a DataFrame by selecting columns or by slicing. A slice returns the rows and columns that fall within the range you specify on the DataFrame object.

1.   Selecting/Accessing a Column

Selecting a column is easy; use either of the following forms:

<DataFrame object>[<column name>]            # using square brackets

Or

<DataFrame object>.<column name>            # using dot(.) notation

 Example:

import pandas as pd

# Marks of six students in three subjects
SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54]}
df = pd.DataFrame(SData)
print(df)

print("Selecting a Column using square brackets")
print(df['Accounts'])

print("Selecting a Column using dot(.) notation")
print(df.English)

Output
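
To see what the two forms return, here is a minimal sketch built on the same data. The 'Total Marks' column is a hypothetical addition, not part of the example above; it is there only to show that a column name containing a space can be reached with square brackets but not with dot notation.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54]}
df = pd.DataFrame(SData)

# Both forms return the same pandas Series
by_brackets = df['English']
by_dot = df.English
print(type(by_brackets).__name__)    # Series
print(by_brackets.equals(by_dot))    # True

# Hypothetical column with a space in its name:
# df.Total Marks is not valid Python, so square brackets are required
df['Total Marks'] = df['Accounts'] + df['English'] + df['Bst']
total = df['Total Marks']
print(total.tolist())
```

Square brackets work for every column name, so they are the safer default; dot notation is a shortcut for names that are valid Python identifiers and do not clash with DataFrame attributes.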

2.   Selecting/Accessing Multiple Columns

<DataFrame object>[[<column name>, <column name>, ...]]

Example

import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54]}
# Custom row labels used as the index
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)
print(df)

print("Selecting/Accessing Multiple Columns")
print(df[['Accounts', 'Bst']])

Output:
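
The inner list is what makes the difference here. As a small sketch on the same data, the block below compares a list of column names (which gives a DataFrame) with a single name (which gives a Series); the variable names are illustrative only.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

two_cols = df[['Accounts', 'Bst']]    # list of names -> DataFrame
one_col_frame = df[['Accounts']]      # one-item list -> still a DataFrame
one_col_series = df['Accounts']       # bare name -> Series

# Columns come back in the order given in the list
reordered = df[['Bst', 'name']]

print(two_cols.shape)                 # (6, 2)
print(type(one_col_frame).__name__)   # DataFrame
print(type(one_col_series).__name__)  # Series
print(list(reordered.columns))        # ['Bst', 'name']
```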

3.   Selecting/Accessing a subset from a Dataframe using Row/Column Names

loc provides label-based indexing: it selects rows and columns by their labels in the index. Syntax:

<DataFrame object>.loc[<startrow>:<endrow>,<startcolumn>:<endcolumn>]

This is the general syntax, through which you can access single or multiple rows and columns.

(a)       To access a row, just give the row name/label like this:

<DataFrame object>.loc[<row label>,:]

Example

import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)
print(df)

print("To Access Row")
# All columns of the row labelled 'Sno2'
print(df.loc['Sno2', :])

Output 
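
A single row selected with loc comes back as a Series whose index is the list of column names. The sketch below, on the same data, checks this; the variable names are illustrative.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

row = df.loc['Sno2', :]              # one row label -> Series
same_row = df.loc['Sno2']            # the trailing ",:" is optional

print(type(row).__name__)            # Series
print(list(row.index))               # ['name', 'Accounts', 'English', 'Bst']
print(row['name'], row['Accounts'])  # Vinay 76

# Wrapping the label in a list keeps the result a one-row DataFrame
row_frame = df.loc[['Sno2'], :]
print(row_frame.shape)               # (1, 4)
```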

To access selective columns, use:

<DataFrame object>.loc[:,<start column>:<end column>]

Make sure not to miss the COLON BEFORE THE COMMA. As with rows, all columns falling between the start and end columns, including both of them, will be listed:

Example:

import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)
print(df)

print("To access selective columns")
# All rows, columns from 'Accounts' through 'IP'
print(df.loc[:, 'Accounts':'IP'])
print()
# All rows, columns from 'Accounts' through 'English'
print(df.loc[:, 'Accounts':'English'])

Output:
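
A label slice with loc includes the end label, unlike ordinary Python slicing. The following sketch, on the same data, lists the columns each slice returns; passing a list of labels for non-adjacent columns is shown as an extra, and the variable names are illustrative.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

all_marks = df.loc[:, 'Accounts':'IP']        # 'IP' itself is included
first_two = df.loc[:, 'Accounts':'English']   # 'English' itself is included
print(list(all_marks.columns))   # ['Accounts', 'English', 'Bst', 'IP']
print(list(first_two.columns))   # ['Accounts', 'English']

# Non-adjacent columns: pass a list of labels instead of a slice
picked = df.loc[:, ['name', 'IP']]
print(list(picked.columns))      # ['name', 'IP']
```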

To access a range of columns from a range of rows, use:

<DataFrame object>.loc[<startrow>:<endrow>,<startcolumn>:<endcolumn>]

Example:

import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)
print(df)

print("To access range of columns from range of rows")
# Rows 'Sno2' through 'Sno5', columns 'Accounts' through 'IP'
print(df.loc['Sno2':'Sno5', 'Accounts':'IP'])

Output:
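
As a quick check of what this row and column range selects, the sketch below rebuilds the same DataFrame and inspects the shape and labels of the result; the name block is illustrative.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

block = df.loc['Sno2':'Sno5', 'Accounts':'IP']

print(block.shape)            # (4, 4): four rows and four columns
print(list(block.index))      # ['Sno2', 'Sno3', 'Sno4', 'Sno5']
print(list(block.columns))    # ['Accounts', 'English', 'Bst', 'IP']

# A single cell of the subset, addressed by row and column label
print(block.loc['Sno5', 'IP'])   # 87
```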

 

A Subset/Slice from a DataFrame using Row/Column Numeric Index/Position

Sometimes your DataFrame does not have row or column labels, or you may not remember them. In such cases you can extract a subset using numeric row and column positions, this time with iloc (integer location) instead of loc. Unlike loc, an iloc slice excludes the end position, just like ordinary Python slicing.

<DataFrame object>.iloc[<start row index>:<end row index>,<start column index>:<end column index>]

Example:

import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)
print(df)

print("To access range of columns from numeric index/Position")
# Rows at positions 0 and 1, column at position 1 only
print(df.iloc[0:2, 1:2])

print("To access range of columns from numeric index/Position")
# Rows at positions 0 and 1, columns at positions 1, 2 and 3
print(df.iloc[0:2, 1:4])

Output
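
The main difference from loc is the end of the slice: iloc stops before the end position, while loc includes the end label. A minimal sketch on the same data, with illustrative variable names, compares the two:

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

narrow = df.iloc[0:2, 1:2]   # rows 0-1, column 1 only
wide = df.iloc[0:2, 1:4]     # rows 0-1, columns 1, 2 and 3
print(list(narrow.columns))  # ['Accounts']
print(list(wide.columns))    # ['Accounts', 'English', 'Bst']

# The same subset by label: both end labels are included
by_label = df.loc['Sno1':'Sno2', 'Accounts':'Bst']
print(wide.equals(by_label))  # True
```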


Both loc and iloc are very flexible and can be used in a variety of ways.
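
A few of those other ways, as a hedged sketch on the same data: lists of labels or positions, a boolean condition, and a slice with a step. These are only some of the forms both indexers accept, and the variable names are illustrative.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

# Lists pick non-adjacent rows and columns
ends_by_label = df.loc[['Sno1', 'Sno6'], ['name', 'IP']]
ends_by_position = df.iloc[[0, -1], [0, -1]]   # negative positions count from the end
print(ends_by_label.equals(ends_by_position))  # True

# A boolean condition keeps only the rows where it is True
high = df.loc[df['Accounts'] > 75, ['name', 'Accounts']]
print(high['name'].tolist())    # ['Vinay', 'Vinita', 'Ravi', 'Manoj']

# A step in the slice takes every second row
every_other = df.iloc[::2, :]
print(list(every_other.index))  # ['Sno1', 'Sno3', 'Sno5']
```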

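Selecting/Accessing a Single Value

(i)                 Select the column, then give either the row name or the row's numeric index in square brackets

<DF object>.<column>[<row name or row numeric index>]

When the row labels are not integers, as with Sno1 to Sno6 here, an integer in the brackets is treated as a position. Recent pandas releases deprecate this fallback, so <DF object>.<column>.iloc[<row numeric index>] is the safer way to select by position.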

Example:

>>> df.English['Sno3']

54

>>> df.English[2]

54
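
The same two lookups can be written so that the intent (label or position) is explicit. This is a minimal sketch on the Sno-indexed data used earlier; it uses .loc and .iloc on the column instead of a bare integer in square brackets, since positional lookup with a bare integer on a labelled Series is deprecated in recent pandas versions.

```python
import pandas as pd

SData = {'name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 98, 54, 76, 87],
         'English': [89, 87, 54, 89, 43, 67],
         'Bst': [65, 67, 87, 56, 87, 54],
         'IP': [98, 76, 98, 56, 87, 99]}
Sno = ['Sno1', 'Sno2', 'Sno3', 'Sno4', 'Sno5', 'Sno6']
df = pd.DataFrame(SData, index=Sno)

english = df['English']           # the column as a Series

by_label = english['Sno3']        # lookup by row label
by_loc = english.loc['Sno3']      # same lookup, explicitly label-based
by_position = english.iloc[2]     # third row, explicitly position-based

print(by_label, by_loc, by_position)   # 54 54 54
```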

(ii)                 Access a single value for a row/column pair (using the iat accessor)

The iat accessor is used to access a single value for a row/column pair by integer position. It is indexed with square brackets, not called like a function.

It is similar to iloc, in that both provide integer-based lookups. Use iat if you only need to get or set a single value in a DataFrame or Series.

<DF object>.iat[<row index>,<column index>]

Example :

import pandas as pd

SData = {'Name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 91, 54, 76, 87],
         'English': [89, 85, 65, 89, 43, 67],
         'Bst': [65, 67, 83, 78, 80, 54],
         'IP': [98, 76, 98, 60, 32, 99]}
df = pd.DataFrame(SData)
print(df)

# Row position 4 is Ravi; column position 3 is 'Bst'
print("show row index 4 and column index 3 ->", df.iat[4, 3])

Output:

Get value at specified row/column pair

>>> df.iat[3,3]

78

Set value at specified row/column pair 

>>> df.iat[3,3]=100

>>> df.iat[3,3]

100

Get value within a series 

>>> df.loc[0].iat[2]

89

DataFrame at accessor: at is used to access a single value for a row/column label pair.

It is similar to loc, in that both provide label-based lookups. Use at if you only need to get or set a single value in a DataFrame or Series.

(a)   Get value at specified row/column pair:

(b)   Set value at specified row/column pair:

(c)   Get value within a Series:
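
The three uses listed above can be sketched as follows. This is an illustrative example rather than the original one: it reuses the data from the iat example, where the row labels are the default integers 0 to 5 and the column labels are the subject names.

```python
import pandas as pd

SData = {'Name': ['Taran', 'Vinay', 'Vinita', 'Rishabh', 'Ravi', 'Manoj'],
         'Accounts': [54, 76, 91, 54, 76, 87],
         'English': [89, 85, 65, 89, 43, 67],
         'Bst': [65, 67, 83, 78, 80, 54],
         'IP': [98, 76, 98, 60, 32, 99]}
df = pd.DataFrame(SData)

# Get value at specified row/column label pair
before = df.at[3, 'Bst']
print(before)                     # 78

# Set value at specified row/column label pair
df.at[3, 'Bst'] = 100
after = df.at[3, 'Bst']
print(after)                      # 100

# Get value within a Series: row 0 as a Series, then the 'English' entry
within = df.loc[0].at['English']
print(within)                     # 89
```

Here 3 and 0 are row labels, not positions; they coincide with positions only because the DataFrame uses the default integer index.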
