Python Tutorial: Pandas DataFrame-Pivot and Pivot Table exercises

Monday, 2 November 2020

Pandas DataFrame-Pivot and Pivot Table exercises

 

1. An Emp.csv file table contains the following data: (emp.csv)

EmpNo

Name

Department

Salary

Commission

Job

100

Sunita Arora

RESEARCH

45600

4560

Clerk

101

Ashok Singhal

SALES

43000

4300

Salesman

102

Sumit Avasti

SALES

45900

4590

Salesman

103

Nitin Katiyar

RESEARCH

35000

3500

Manager

104

Varun Saini

SALES

42000

4200

Salesman

105

Tarun Ahuja

SALES

48000

4800

Manager

106

Divya Katiyar

ACCOUNTS

42500

4250

Manager

107

Bhavneet

RESEARCH

45000

4500

Analyst

108

Jyotsna Sharma

ACCOUNTS

41000

4100

Clerk

109

Ankita Shukla

SALES

35000

3500

Manager

110

Vanshika Verma

OPERATIONS

45600

4560

Sr. Manager

111

Swati Jaiswal

RESEARCH

47800

 

Sr. Manager

112

Shradha Singh

OPERATIONS

43600

 

Clerk

(a)    Using above table create a DataFrame called DFemp

(b)   Display the department wise total salary.

(c)    Display the department wise average salary.

(d)   Display the department wise total and average salary.

(e)   Display the department wise maximum and minimum salary.

(f)     Display the department and job wise maximum salary.

SOLUTION

import pandas as pd

import numpy as np

DFemp=pd.read_csv('d:/emp.csv')

print(DFemp)

print()

print('department wise total salary')

pv=DFemp.pivot_table(index='Department',values='Salary',aggfunc=np.sum)

print(pv)

print()

pv1=DFemp.pivot_table(index='Department',values='Salary',aggfunc=np.mean)

print(pv1)

print()

pv2=DFemp.pivot_table(index='Department',values='Salary',aggfunc=[np.sum,np.mean])

print(pv2)

print()

pv3=DFemp.pivot_table(index='Department',values='Salary',aggfunc=[np.max,np.min])

print(pv3)

pv4=DFemp.pivot_table(index=['Department','Job'],values='Salary',aggfunc=np.max)

print(pv4)

 

2. A sample dataset with different columns as given

Table -Item

Itemno

ItemName

Color

Price

1

Ball Pen

Black

15

2

Pencil

Blue

5.5

3

Ball Pen

Green

10.5

4

Gel Pen

Green

11

5

Notebook

Red

15.5

6

Ball Pen

Green

11.5

7

Highlighter

Blue

8.5

8

Gel Pen

Red

12.5

9

P Marker

Blue

8.6

10

Pencil

Green

11.5

11

Ball Pen

Green

10.5

Answer the following questions(assume that the DataFrame name is dfA)

(a)    Using the above table, create a DataFrame called df.

(b)   Create a pivot table to display item name wise items from DataFrame df.

(c)    Create a table to display item name and item number wise price for all rows.

(d)   Create a table to display item name and item number wise price for all colors.

(e)   Create a table to display item name and item number wise sum of color values.

(f)     Create a table to display item name and item number wise total price for all color items along the rows and columns.

(g)    Create a table to display item name wise total price for all colors.

SOLUTION

import pandas as pd

import numpy as np

print("Answer A")

df=pd.read_csv('d:/items.csv')

print(df)

print("Answer B")

print('item name wise items')

pv=df.pivot_table(index='ItemName',columns='Itemno', fill_value='-')

print(pv)

print("Answer C")

pv1=df.pivot_table(index=['ItemName','Itemno'], fill_value='-')

print(pv1)

print("Answer D")

pv2=df.pivot_table(index=['ItemName','Itemno'],columns='Color',fill_value="-")

print(pv2)

print("Answer E")

pv3=df.pivot_table(index=['ItemName'],columns='Color',fill_value="-",aggfunc=np.sum)

print(pv3)

print("Answer F")

pv4=pv3=df.pivot_table(index=['ItemName','Itemno'],columns='Color',aggfunc=np.sum,margins=True)

print(pv4)

print("Answer G")

pv4df.pivot_table(index=['ItemName'],columns='Color',values='Price',fill_value="-", aggfunc=np.sum,margins=True)

print(pv4)

3. A DataFrame contains information  Customer.csv:

 

Customer

Region

Order_Date

Sales

Month

Year

0

Lamba book Stall

East

2020-04-19

1256000

April

2020

1

Muskan book Stall

South

2019-05-22

1359000

May

2019

2

Sarswati book Stall

North

2020-06-11

1670000

August

2020

3

TM books

West

2020-06-02

1820000

October

2020

4

Narayana Pustak

North

2020-10-11

2560000

August

2020

5

BOOKWELL distributers

West

2018-05-11

1568000

September

2018

6

Jain book Agency

South

2017-06-20

1233000

May

2017

7

New India Agency

North

2016-10-29

1730000

November

2016

8

Mahabharat Book Stall

West

2020-06-02

2560000

January

2020

9

Aviral Bustak Bhandar

South

2015-10-11

1568000

October

2015

 

Libra book Seller

East

2018-05-11

1233000

December

2018

 

SOLUTION

import pandas as pd

import numpy as np

df=pd.read_csv('d:/customer.csv')

print(df)

print("Answer a")

pv=df.pivot_table(index='Region',columns='Year',values='Sales',aggfunc='mean')

print(pv)

print("Answer b")

pv1=df.pivot_table(index='Year',columns='Region',aggfunc='mean')

print(pv1)

print("Answer c")

pv2=df.pivot_table(index='Year',columns='Region',values='Sales',aggfunc='sum')

print(pv2)

print("Answer d")

pv3=df.pivot_table(index='Year',columns='Region',values='Sales',aggfunc=[np.min,np.max])

print(pv3)

print("Answer e")

pv4=df.pivot_table(index='Region',aggfunc='count')

print(pv4)

print("Answer f")

Pv5=df.pivot_table(values='Sales',columns='Region',index='Year')

print(Pv5)

Pv6=df.pivot_table(index='Region',columns='Year',aggfunc=np.mean)

print(Pv6.stack())



4. A data set is given the sales of two products in four different regions.

Region

Year

Product

Unit Sold

Southeast

2018

Air Purifier

87

Northwest

2019

Air Conditioner

165

Southwest

2019

Air Purifier

122

Northeast

2019

Air Conditioner

132

Southeast

2018

Air Conditioner

98

Northeast

2019

Air Purifier

120

Northwest

2018

Air Purifier

137

Southeast

2019

Air Conditioner

83

Northwest

2018

Air Purifier

128

Northwest

2019

Air Conditioner

149

Southwest

2018

Air Purifier

167

Northeast

2018

Air Conditioner

139

Using the above data set answer the following :

a)      Create table pivotable to summarize the data into region and product wise total sales.

Solution :

import numpy as np

import pandas as pd

df=pd.read_csv('d:/regions.csv')

print(df)

print()

pv=df.pivot_table(index=['Region','Product'],values='Unit Sold',aggfunc=np.sum)

print(pv)

 

b)      Print the summary report.

 


5.  A data set is given with 11 employee in different cities as given below (EJob.csv) :

Name

Position

City

Age

Sex

Aakash

Manager

Delhi

35

M

Sunakshi

Programmer

Mumbai

37

F

Amitabh

Manager

Kanpur

33

M

Madhuri

Programmer

Mumbai

40

F

Ashish

Manager

Kanpur

27

M

Akshay

Programmer

Kanpur

34

M

Preeti

Programmer

Delhi

26

F

Govinda

Manager

Delhi

30

M

Rani

Manager

Mumbai

28

F

Shilpa

Manager

Kanpur

26

F

Ayushman

Programmer

Delhi

28

M

Using the above data set table answer the following :

a)      Create a pivot table to print the positions of average of age of each city.

b)      Create a pivot table to print the positions of average of age of city and sex.

c)       Create a pivot table to print the average of age for each position and sex.

d)      What will be output of the following:

SOLUTION

import numpy as np

import pandas as pd

df=pd.read_csv('d:/narainpython/df/Ejob.csv')

print(df)

print()

print("Answer A :")

pv=df.pivot_table(index=['Position'],columns='City',values='Age',aggfunc=np.mean)

print(pv)

print("Answer B :")

pv=df.pivot_table(index=['Position'],columns=['City','Sex'],values='Age',aggfunc=np.mean)

print(pv)

print("Answer C :")

pv=df.pivot_table(index=['Position','Sex'],columns='City',values='Age',aggfunc=np.mean)

print(pv)

print()

print("Output D :")

print(df.pivot_table(index=['Position'],aggfunc={'Age':np.mean}))

 


 

6: A DataFrame contains following data (Employee.csv):

Name

Qualification

Experience

Manorama

Masters

8

Rashmi

Graduate

11

Parul

Post Graduate

7

Satyam

Masters

9

Deepak

Graduate

6

Arif

Post Graduate

8

Adil

Masters

10

Write the command for the following (assume that the Dataframe name is dfT)

(a)    Find the average experience for each qualification.

(b)   Find the total experience for each qualification.

(c)    Find the average experience for each qualification and name.

(d)   What will be the following display:

Pva=pd.pivot_table(dfT,indext= “Qualification”, columns= “Qualifiaction”, values= ‘Experience’, aggfunc=np.sum)

Pva.stack()

7. A sample dataset is given with different columns as given below(Product.csv):

Product_ID

ProductName

Manufacturer

Price

CustomerName

City

PC01

Personal Computer

HCL India

42000

Anchal

Delhi

LC05

Laptop

HP USA

55000

Rupal

Mumbai

PC03

Personal Computer

Dell USA

32000

Parul

Delhi

PC06

Personal Computer

Zenith USA

37000

Anirudh

Chennai

LC03

Laptop

Dell USA

57000

Ashish

Bengaluru

AL03

Monitor

HP USA

9800

Saurya

Delhi

CS03

Harddisk

Dell USA

5400

Sachin

Mumbai

PR06

MotherBoard

Intel USA

17500

Divya

Delhi

BA03

UPS

Microtek India

4300

Yashika

Chennai

MC01

Monitor

HCL India

68000

Sakshi

Bengaluru

 

a)      Create a city wise customer table.

b)      Create a pivot table for manufacturer wise product names and their price.

c)       Arrange the data frame in ascending order of customer name.

d)      Arrange the data frame in ascending order of City and Price.

8. A sample dataset is given with four quarter sales data for five employees:

Name of Employee

Sales

Quarter

State

R sahay

125600

1

Delhi

Sachin

235600

1

Tamil Nadu

Jyotsna

213400

1

Kerala

Amit

189000

1

Haryana

Swati

456000

1

West Bengal

Ankita

436500

2

Haryana

R sahay

214300

2

Delhi

Sachin

325400

2

Tamil Nadu

Jyotsna

121200

2

Kerala

Amit

214300

2

Haryana

Swati

214600

2

West Bengal

Ankita

326890

2

Haryana

R sahay

256000

3

Delhi

Sachin

250600

3

Tamil Nadu

Jyotsna

203400

3

Kerala

Amit

197000

3

Haryana

Swati

406000

3

West Bengal

Ankita

436500

4

Haryana

R sahay

256000

4

Delhi

Sachin

250600

4

Tamil Nadu

Jyotsna

203400

4

Kerala

Write the command for the following : (Employees.csv)

(a)    Find the total sales of each employee.

(b)   Find the total sales by state.

(c)    Find the total sales each by both employees wise and size wise.

(d)   Find the maximum individual sale by state.

(e)   Find the mean, meadin and minimum sales by state.

No comments:

Post a Comment