Pandas Set_Index : set_index()
The reset_index() function resets the index so that it starts from 0.
It moves the current index values into the DataFrame's columns and
sets a simple integer index. It is the inverse operation of the set_index()
function. The syntax is:
DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
Reset the index, or a level of it.
Reset the index of the DataFrame and use the default one instead. If the DataFrame has a
MultiIndex, this method can remove one or more levels.
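Since reset_index() is described above as the inverse of set_index(), a small round trip may make the relationship concrete. This is a minimal sketch; the sample data and the variable names `indexed` and `restored` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({'month': [3, 5, 7], 'sale': [85, 40, 78]})

# set_index() moves the 'month' column into the index.
indexed = df.set_index('month')

# reset_index() moves it back into the columns and restores
# the default integer index 0, 1, 2.
restored = indexed.reset_index()

print(restored.equals(df))
```

The round trip gives back the original frame here because 'month' was the first column; reset_index() always inserts the old index at the front of the columns.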
Parameters:
Name | Description | Type / Default Value | Required / Optional |
level | For a DataFrame or Series with a MultiIndex, only remove the specified levels from the index. Removes all levels by default. | int, str, tuple, or list; default None | Optional |
drop | Just reset the index, without inserting it as a column in the new DataFrame. | bool; default False | Optional |
name | Applies to Series.reset_index() only. The name to use for the column containing the original Series values. Uses self.name by default. This argument is ignored when drop is True. | object | Optional |
inplace | Modify the object in place (do not create a new object). | bool; default False | Optional |
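The name and drop parameters from the table behave a little differently on a Series, so a short sketch may help. The Series contents and the column label 'speed' are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical Series, for illustration only.
s = pd.Series([389.0, 24.0], index=['falcon', 'parrot'], name='max_speed')

# name= sets the label of the column that holds the original Series values.
# The result is a DataFrame with the old index in a column called 'index'.
as_frame = s.reset_index(name='speed')
print(list(as_frame.columns))

# drop=True discards the old index, so the result stays a Series.
dropped = s.reset_index(drop=True)
print(type(dropped).__name__)
```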
import pandas as pd
import numpy as np

df = pd.DataFrame([('bird', 389.0),
                   ('bird', 24.0),
                   ('mammal', 80.5),
                   ('mammal', np.nan)],
                  index=['falcon', 'parrot', 'lion', 'monkey'],
                  columns=('class', 'max_speed'))
print(df)
print("\nWhen we reset the index, the old index is added as a column, and a new sequential index is used:")
print(df.reset_index())
print("\nWe can use the drop parameter to avoid the old index being added as a column:")
print(df.reset_index(drop=True))
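Output
         class  max_speed
falcon    bird      389.0
parrot    bird       24.0
lion    mammal       80.5
monkey  mammal        NaN

When we reset the index, the old index is added as a column, and a new sequential index is used:
    index   class  max_speed
0  falcon    bird      389.0
1  parrot    bird       24.0
2    lion  mammal       80.5
3  monkey  mammal        NaN

We can use the drop parameter to avoid the old index being added as a column:
    class  max_speed
0    bird      389.0
1    bird       24.0
2  mammal       80.5
3  mammal        NaN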
Example 1: Using a single column as index
In this example, one of the columns of the dataframe is used to set the
index through the set_index() function.
>>> df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
...                    'year': [2011, 2013, 2015, 2017, 2019],
...                    'sale': [85, 40, 78, 87, 97]})
>>> df
Output
   month  year  sale
0      3  2011    85
1      5  2013    40
2      7  2015    78
3      9  2017    87
4     11  2019    97
As shown below, the earlier integer index is discarded and the
month column is used as the index.
>>> df.set_index('month')
Output
       year  sale
month
3      2011    85
5      2013    40
7      2015    78
9      2017    87
11     2019    97
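One detail worth checking after this example is that set_index() returns a new dataframe and leaves the original untouched unless the result is assigned. The sketch below illustrates this, along with the drop parameter of set_index(), which the text above does not cover; the variable names `by_month` and `kept` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
                   'year': [2011, 2013, 2015, 2017, 2019],
                   'sale': [85, 40, 78, 87, 97]})

# set_index() returns a new DataFrame; df itself still has all three columns.
by_month = df.set_index('month')
print(list(df.columns))
print(list(by_month.columns))

# drop=False keeps 'month' as a regular column in addition to using it as the index.
kept = df.set_index('month', drop=False)
print(list(kept.columns))
```

This is why the later examples can keep calling set_index() on the same df with different columns.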
Example 2: Using multiple columns as index
In this example, two columns are passed to the set_index() function as a
list, which produces a dataframe with a MultiIndex.
>>> df.set_index(['year', 'month'])
Output
            sale
year month
2011 3        85
2013 5        40
2015 7        78
2017 9        87
2019 11       97
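Once two columns form the index, rows can be looked up by a (year, month) pair. The following is a small sketch of that lookup using .loc; the variable name `indexed` is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
                   'year': [2011, 2013, 2015, 2017, 2019],
                   'sale': [85, 40, 78, 87, 97]})

indexed = df.set_index(['year', 'month'])

# The index now has two levels, named after the source columns.
print(indexed.index.nlevels)
print(list(indexed.index.names))

# A full (year, month) tuple selects a single cell.
print(indexed.loc[(2013, 5), 'sale'])

# A value for the outer level alone selects all rows for that year.
print(indexed.loc[2015])
```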
Example 3: Using set_index function on series data
In this example, Series objects are passed to the set_index function. The
Series values become the levels of the new index, and the existing columns of
the dataframe are left unchanged.
>>> s = pd.Series([3,6,9,12,15])
>>> s
Output :
0     3
1     6
2     9
3    12
4    15
dtype: int64
>>> df.set_index([s, s**3])
Output:
         month  year  sale
3  27        3  2011    85
6  216       5  2013    40
9  729       7  2015    78
12 1728      9  2017    87
15 3375     11  2019    97
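In the output above the two index levels have no names, because the Series passed in were unnamed. As a hedged sketch, naming the Series first (here with the hypothetical labels 'base' and 'cube') carries those names over to the index levels.

```python
import pandas as pd

df = pd.DataFrame({'month': [3, 5, 7, 9, 11],
                   'year': [2011, 2013, 2015, 2017, 2019],
                   'sale': [85, 40, 78, 87, 97]})
s = pd.Series([3, 6, 9, 12, 15])

# Unnamed Series give unnamed index levels.
unnamed = df.set_index([s, s**3])
print(list(unnamed.index.names))

# Series.rename() with a scalar sets the Series name, which set_index()
# then uses as the level name. 'base' and 'cube' are illustrative labels.
named = df.set_index([s.rename('base'), (s**3).rename('cube')])
print(list(named.index.names))

# The original columns are all still present.
print(list(named.columns))
```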
Pandas Reset_Index : reset_index()
The pandas reset_index() function is used for resetting the index of a dataframe.
Syntax
pandas.DataFrame.reset_index(level, drop, inplace, col_level, col_fill)
level : int, str, tuple, or list, default None – Specifies which levels are removed from the index. All levels are removed by default.
drop : bool, default False – If True, the old index is discarded instead of being inserted as a column.
inplace : bool, default False – If True, the dataframe is modified in place.
col_level : int or str, default 0 – If the columns have multiple levels, determines which level the labels are inserted into.
col_fill : object, default '' – If the columns have multiple levels, determines how the other levels are named.
The pandas reset_index function returns a dataframe with the new index, or
None if inplace is True.
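The difference between the two return behaviours can be seen in a short sketch. The sample data and variable names here are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'sale': [85, 40]}, index=['a', 'b'])

# Default: a new DataFrame is returned and df is left unchanged.
result = df.reset_index()
print(list(result.columns))
print(list(df.columns))

# inplace=True: df itself is modified and the call returns None.
returned = df.reset_index(inplace=True)
print(returned)
print(list(df.columns))
```

A common slip is writing `df = df.reset_index(inplace=True)`, which replaces df with None.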
Example 1: Simple example of reset_index() function
Here a dataframe is created with string labels as its index. The
reset_index() function then moves those labels into a column named index and
gives the dataframe a default integer index.
>>> df = pd.DataFrame([('fruit', 389.0),
...                    ('fruit', np.nan),
...                    ('vegetable', 80.5),
...                    ('vegetable', 450.5)],
...                   index=['kiwi', 'mango', 'potato', 'tomato'],
...                   columns=('type', 'water_content'))
>>> df
Output :
             type  water_content
kiwi        fruit          389.0
mango       fruit            NaN
potato  vegetable           80.5
tomato  vegetable          450.5
>>> df.reset_index()
Output :
    index       type  water_content
0    kiwi      fruit          389.0
1   mango      fruit            NaN
2  potato  vegetable           80.5
3  tomato  vegetable          450.5
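The new column is called index only because the original index had no name. As a sketch, giving the index a name first (the label 'produce' is made up for illustration) changes the column label, and drop=True skips the column altogether.

```python
import pandas as pd

# Smaller hypothetical frame in the same spirit as the example above.
df = pd.DataFrame({'type': ['fruit', 'vegetable']},
                  index=['kiwi', 'potato'])

# rename_axis() names the index; reset_index() then uses that name
# for the inserted column instead of the default 'index'.
named = df.rename_axis('produce').reset_index()
print(list(named.columns))

# drop=True discards the old labels entirely.
dropped = df.reset_index(drop=True)
print(list(dropped.columns))
print(list(dropped.index))
```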
Example 2: Using level parameter with multiindex in reset_index function
In this example, a dataframe with a MultiIndex is created, and the
reset_index function is then called with the level parameter.
>>> index = pd.MultiIndex.from_tuples([('B-class', 'BMW'),
...                                    ('B-class', 'Audi'),
...                                    ('A-class', 'Jaguar'),
...                                    ('A-class', 'Mercedes')],
...                                   names=['class', 'name'])
>>> columns = pd.MultiIndex.from_tuples([('speed', 'max'),
...                                      ('company', 'type')])
>>> df = pd.DataFrame([(389.0, 'Sedan'),
...                    (24.0, 'Sedan'),
...                    (80.5, 'Hatchback'),
...                    (np.nan, 'Sports')],
...                   index=index,
...                   columns=columns)
>>> df
Output :
                  speed    company
                    max       type
class   name
B-class BMW       389.0      Sedan
        Audi       24.0      Sedan
A-class Jaguar     80.5  Hatchback
        Mercedes    NaN     Sports
Using the reset_index function, we can pass the class level to the level
parameter. As we can see, only the class level is removed from the index and
inserted as a column, while name remains as the index.
>>> df.reset_index(level='class')
Output :
            class  speed    company
                     max       type
name
BMW       B-class  389.0      Sedan
Audi      B-class   24.0      Sedan
Jaguar    A-class   80.5  Hatchback
Mercedes  A-class    NaN     Sports
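The col_level and col_fill parameters from the syntax section only matter when the columns themselves have multiple levels, as they do here. The sketch below uses a shortened version of the dataframe and a made-up fill label 'info' to show where the class label ends up.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([('B-class', 'BMW'),
                                   ('A-class', 'Jaguar')],
                                  names=['class', 'name'])
columns = pd.MultiIndex.from_tuples([('speed', 'max'),
                                     ('company', 'type')])
df = pd.DataFrame([(389.0, 'Sedan'),
                   (80.5, 'Hatchback')],
                  index=index, columns=columns)

# Default: 'class' is inserted at column level 0; the other level is ''.
default = df.reset_index(level='class')
print(list(default.columns))

# col_level=1 inserts 'class' at the second column level, and col_fill
# names the remaining level. 'info' is an illustrative label.
lower = df.reset_index(level='class', col_level=1, col_fill='info')
print(list(lower.columns))
```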