Python Tutorial: Pandas DataFrame-Indexing

Friday, 4 September 2020

Pandas DataFrame-Indexing

Pandas Set_Index : set_index()

The set_index() function makes one or more columns (or arrays) the index of a DataFrame. Its inverse is reset_index(), which moves the index values back into the DataFrame's columns and sets a simple integer index starting from 0. The syntax of reset_index() is:

DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Reset the index, or a level of it.

Reset the index of the DataFrame, and use the default one instead. If the DataFrame has a MultiIndex, this method can remove one or more levels.

Parameters:

level : int, str, tuple, or list, default None (optional)

– For a DataFrame with a MultiIndex, only remove the specified levels from the index. Removes all levels by default.

drop : bool, default False (optional)

– Just reset the index, without inserting it as a column in the new DataFrame.

name : object (optional)

– Accepted by Series.reset_index() only; it is not part of the DataFrame signature above. The name to use for the column containing the original Series values. Uses self.name by default. This argument is ignored when drop is True.

inplace : bool, default False (optional)

– Modify the DataFrame in place (do not create a new object).
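As a rough illustration of the drop and inplace parameters described above, the sketch below uses a small made-up frame (the labels 'a' and 'b' and the sale values are illustrative only). It shows that a default call returns a new DataFrame, while inplace=True changes the existing one and returns None.

```python
import pandas as pd

# Hypothetical two-row frame with string labels as the index.
df = pd.DataFrame({"sale": [85, 40]}, index=["a", "b"])

# Default call: returns a new DataFrame and leaves df untouched.
out = df.reset_index()
print(list(out.columns))   # ['index', 'sale']
print(list(df.index))      # ['a', 'b']

# inplace=True modifies df itself and returns None;
# drop=True discards the old labels instead of keeping them as a column.
result = df.reset_index(drop=True, inplace=True)
print(result)              # None
print(list(df.index))      # [0, 1]
print(list(df.columns))    # ['sale']
```

Because inplace=True returns None, assigning its result back to df would lose the data, so the two styles should not be mixed.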

Example:

import pandas as pd
import numpy as np

df = pd.DataFrame([('bird', 389.0),
                   ('bird', 24.0),
                   ('mammal', 80.5),
                   ('mammal', np.nan)],
                  index=['falcon', 'parrot', 'lion', 'monkey'],
                  columns=('class', 'max_speed'))

print(df)
print("\nWhen we reset the index, the old index is added as a column, and a new sequential index is used:")
print(df.reset_index())
print("\nWe can use the drop parameter to avoid the old index being added as a column:")
print(df.reset_index(drop=True))

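Output:

         class  max_speed
falcon    bird      389.0
parrot    bird       24.0
lion    mammal       80.5
monkey  mammal        NaN

When we reset the index, the old index is added as a column, and a new sequential index is used:
    index   class  max_speed
0  falcon    bird      389.0
1  parrot    bird       24.0
2    lion  mammal       80.5
3  monkey  mammal        NaN

We can use the drop parameter to avoid the old index being added as a column:
    class  max_speed
0    bird      389.0
1    bird       24.0
2  mammal       80.5
3  mammal        NaN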

Example 1: Using a column heading as the index

This example shows how to use one of the DataFrame's columns as the index through the set_index() function.

>>> df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
...                    'year': [2011, 2013, 2015, 2017, 2019],
...                    'sale': [85, 40, 78, 87, 97]})
>>> df

Output:

   month  year  sale
0      3  2011    85
1      5  2013    40
2      7  2015    78
3      9  2017    87
4     11  2019    97

Calling set_index('month') discards the earlier integer index and uses the month column as the index:

>>> df.set_index('month')

Output:

       year  sale
month            
3      2011    85
5      2013    40
7      2015    78
9      2017    87
11     2019    97
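One detail the output above does not show is that set_index() returns a new DataFrame and leaves the original unchanged (unless inplace=True is passed). The following sketch, which rebuilds the same sales frame and uses the illustrative name by_month, checks this and also shows a label lookup and the round trip back through reset_index().

```python
import pandas as pd

df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
                   'year': [2011, 2013, 2015, 2017, 2019],
                   'sale': [85, 40, 78, 87, 97]})

by_month = df.set_index('month')

# The original frame still has 'month' as an ordinary column.
print(list(df.columns))         # ['month', 'year', 'sale']
print(list(by_month.columns))   # ['year', 'sale']

# Label-based lookup now uses the month values.
print(by_month.loc[7, 'sale'])  # 78

# reset_index() moves 'month' back to a column, restoring the original layout.
print(by_month.reset_index().equals(df))  # True
```

This is also why the later examples can keep calling set_index() on the same df.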

Example 2: Using multiple columns as the index

In this example, two columns are passed as a list to set_index(), which produces a MultiIndex with one level per column.

>>> df.set_index(['year', 'month'])

Output:

            sale
year month      
2011 3        85
2013 5        40
2015 7        78
2017 9        87
2019 11       97
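To make the resulting MultiIndex more concrete, here is a minimal sketch of how rows might be looked up once year and month are both index levels. The variable name indexed is illustrative; the data is the same sales frame as above.

```python
import pandas as pd

df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
                   'year': [2011, 2013, 2015, 2017, 2019],
                   'sale': [85, 40, 78, 87, 97]})

indexed = df.set_index(['year', 'month'])

print(indexed.index.nlevels)       # 2
print(list(indexed.index.names))   # ['year', 'month']

# A full key is a tuple with one value per level.
print(indexed.loc[(2015, 7), 'sale'])    # 78

# A partial key selects on the first level only;
# the remaining level ('month') stays as the index of the result.
print(indexed.loc[2017].index.tolist())  # [9]
```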

Example 3: Using set_index() with Series data

In this example, set_index() is passed Series objects instead of column names. The Series values become the levels of a new MultiIndex; the existing columns of the DataFrame are left unchanged.

>>> s = pd.Series([3, 6, 9, 12, 15])
>>> s

Output:

0     3
1     6
2     9
3    12
4    15
dtype: int64

>>> df.set_index([s, s**3])

Output:

         month  year  sale
3  27        3  2011    85
6  216       5  2013    40
9  729       7  2015    78
12 1728      9  2017    87
15 3375     11  2019    97
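The sketch below checks what happened to the Series in the example above: they end up as index levels rather than columns. It rebuilds the same frame; the name out is illustrative. Since the Series have no name, the index levels are unnamed, and reset_index() falls back to the column names level_0 and level_1.

```python
import pandas as pd

df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
                   'year': [2011, 2013, 2015, 2017, 2019],
                   'sale': [85, 40, 78, 87, 97]})
s = pd.Series([3, 6, 9, 12, 15])

out = df.set_index([s, s**3])

# The columns are untouched; the Series became two index levels.
print(list(out.columns))                      # ['month', 'year', 'sale']
print(out.index.nlevels)                      # 2
print(out.index.get_level_values(1).tolist()) # [27, 216, 729, 1728, 3375]

# Unnamed levels get default column names when the index is reset.
print(list(out.reset_index().columns))
# ['level_0', 'level_1', 'month', 'year', 'sale']
```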

Pandas Reset_Index : reset_index()

The pandas reset_index() function resets the index of a DataFrame to the default integer index.

Syntax

DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

level : int, str, tuple, or list, default None

– Specifies which index levels to remove from the index. All levels are removed by default.

drop : bool, default False

– If True, the old index is discarded instead of being inserted as a column; the index is simply reset to the default integer index.

inplace : bool, default False

– If True, modifies the DataFrame in place instead of returning a new one.

col_level : int or str, default 0

– If the columns have multiple levels, determines which level the index labels are inserted into. By default they are inserted into the first level.

col_fill : object, default ''

– If the columns have multiple levels, determines how the other levels are named.

reset_index() returns a DataFrame with the new index, or None if inplace=True.

Example 1: Simple example of the reset_index() function

Here a DataFrame is created with string labels as its index. Calling reset_index() then moves those labels into a new column named index and assigns the default integer index.

>>> df = pd.DataFrame([('fruit', 389.0),
...                    ('fruit', np.nan),
...                    ('vegetable', 80.5),
...                    ('vegetable', 450.5)],
...                   index=['kiwi', 'mango', 'potato', 'tomato'],
...                   columns=('type', 'water_content'))
>>> df

Output:

             type  water_content
kiwi        fruit          389.0
mango       fruit            NaN
potato  vegetable           80.5
tomato  vegetable          450.5

>>> df.reset_index()

Output:

    index       type  water_content
0    kiwi      fruit          389.0
1   mango      fruit            NaN
2  potato  vegetable           80.5
3  tomato  vegetable          450.5
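The new column above is called index because the original index had no name. As a small sketch of two common variations on the same data: drop=True discards the labels entirely, and naming the index first (here with rename_axis and the made-up name 'food') gives the new column a more meaningful header.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([('fruit', 389.0),
                   ('fruit', np.nan),
                   ('vegetable', 80.5),
                   ('vegetable', 450.5)],
                  index=['kiwi', 'mango', 'potato', 'tomato'],
                  columns=('type', 'water_content'))

kept = df.reset_index()              # old labels kept as column 'index'
dropped = df.reset_index(drop=True)  # old labels discarded
print(list(kept.columns))     # ['index', 'type', 'water_content']
print(list(dropped.columns))  # ['type', 'water_content']

# 'food' is an illustrative index name, not part of the original example.
named = df.rename_axis('food').reset_index()
print(list(named.columns))    # ['food', 'type', 'water_content']
```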

Example 2: Using the level parameter with a MultiIndex in reset_index()

In this example, reset_index() is given the level parameter. We first create a DataFrame with a MultiIndex on both the rows and the columns, and then call reset_index() on it.

>>> index = pd.MultiIndex.from_tuples([('B-class', 'BMW'),
...                                    ('B-class', 'Audi'),
...                                    ('A-class', 'Jaguar'),
...                                    ('A-class', 'Mercedes')],
...                                   names=['class', 'name'])
>>> columns = pd.MultiIndex.from_tuples([('speed', 'max'),
...                                      ('company', 'type')])
>>> df = pd.DataFrame([(389.0, 'Sedan'),
...                    (24.0, 'Sedan'),
...                    (80.5, 'Hatchback'),
...                    (np.nan, 'Sports')],
...                   index=index,
...                   columns=columns)
>>> df

Output:

                  speed    company
                    max       type
class   name                      
B-class BMW       389.0      Sedan
        Audi       24.0      Sedan
A-class Jaguar     80.5  Hatchback
        Mercedes    NaN     Sports

Passing 'class' to the level parameter of reset_index() removes only that level from the index. As the output shows, class becomes a regular column while name remains the index.

>>> df.reset_index(level='class')

Output:

            class  speed    company
                     max       type
name                               
BMW       B-class  389.0      Sedan
Audi      B-class   24.0      Sedan
Jaguar    A-class   80.5  Hatchback
Mercedes  A-class    NaN     Sports
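With MultiIndex columns, the col_level and col_fill parameters described in the syntax section control where the new class label lands. The sketch below rebuilds the same car data and inspects only the first column label of each result; the fill value 'vehicle' is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('B-class', 'BMW'),
                                   ('B-class', 'Audi'),
                                   ('A-class', 'Jaguar'),
                                   ('A-class', 'Mercedes')],
                                  names=['class', 'name'])
columns = pd.MultiIndex.from_tuples([('speed', 'max'),
                                     ('company', 'type')])
df = pd.DataFrame([(389.0, 'Sedan'),
                   (24.0, 'Sedan'),
                   (80.5, 'Hatchback'),
                   (np.nan, 'Sports')],
                  index=index, columns=columns)

# Default: the label goes into the top column level, the lower level is ''.
default = df.reset_index(level='class')
print(default.columns.tolist()[0])   # ('class', '')

# col_level=1 puts the label into the second column level instead.
lower = df.reset_index(level='class', col_level=1)
print(lower.columns.tolist()[0])     # ('', 'class')

# col_fill names the other level ('vehicle' is illustrative).
filled = df.reset_index(level='class', col_level=1, col_fill='vehicle')
print(filled.columns.tolist()[0])    # ('vehicle', 'class')
```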
