Python Tutorial: Pandas DataFrame-Pivot and Pivot Table exercises

1. An Emp.csv file table contains the following data: (emp.csv)

EmpNo	Name	Department	Salary	Commission	Job
100	Sunita Arora	RESEARCH	45600	4560	Clerk
101	Ashok Singhal	SALES	43000	4300	Salesman
102	Sumit Avasti	SALES	45900	4590	Salesman
103	Nitin Katiyar	RESEARCH	35000	3500	Manager
104	Varun Saini	SALES	42000	4200	Salesman
105	Tarun Ahuja	SALES	48000	4800	Manager
106	Divya Katiyar	ACCOUNTS	42500	4250	Manager
107	Bhavneet	RESEARCH	45000	4500	Analyst
108	Jyotsna Sharma	ACCOUNTS	41000	4100	Clerk
109	Ankita Shukla	SALES	35000	3500	Manager
110	Vanshika Verma	OPERATIONS	45600	4560	Sr. Manager
111	Swati Jaiswal	RESEARCH	47800		Sr. Manager
112	Shradha Singh	OPERATIONS	43600		Clerk

(a) Using above table create a DataFrame called DFemp

(b) Display the department wise total salary.

(d) Display the department wise total and average salary.

(e) Display the department wise maximum and minimum salary.

(f) Display the department and job wise maximum salary.

SOLUTION

import pandas as pd

import numpy as np

DFemp=pd.read_csv('d:/emp.csv')

print(DFemp)

print()

print('department wise total salary')

pv=DFemp.pivot_table(index='Department',values='Salary',aggfunc=np.sum)

print(pv)

print()

pv1=DFemp.pivot_table(index='Department',values='Salary',aggfunc=np.mean)

print(pv1)

print()

pv2=DFemp.pivot_table(index='Department',values='Salary',aggfunc=[np.sum,np.mean])

print(pv2)

print()

pv3=DFemp.pivot_table(index='Department',values='Salary',aggfunc=[np.max,np.min])

print(pv3)

pv4=DFemp.pivot_table(index=['Department','Job'],values='Salary',aggfunc=np.max)

print(pv4)

2. A sample dataset with different columns as given

Table -Item

Itemno	ItemName	Color	Price
1	Ball Pen	Black	15
2	Pencil	Blue	5.5
3	Ball Pen	Green	10.5
4	Gel Pen	Green	11
5	Notebook	Red	15.5
6	Ball Pen	Green	11.5
7	Highlighter	Blue	8.5
8	Gel Pen	Red	12.5
9	P Marker	Blue	8.6
10	Pencil	Green	11.5
11	Ball Pen	Green	10.5

Answer the following questions(assume that the DataFrame name is dfA)

(a) Using the above table, create a DataFrame called df.

(b) Create a pivot table to display item name wise items from DataFrame df.

(d) Create a table to display item name and item number wise price for all colors.

(e) Create a table to display item name and item number wise sum of color values.

(f) Create a table to display item name and item number wise total price for all color items along the rows and columns.

(g) Create a table to display item name wise total price for all colors.

SOLUTION

import pandas as pd

import numpy as np

print("Answer A")

df=pd.read_csv('d:/items.csv')

print(df)

print("Answer B")

print('item name wise items')

pv=df.pivot_table(index='ItemName',columns='Itemno', fill_value='-')

print(pv)

print("Answer C")

pv1=df.pivot_table(index=['ItemName','Itemno'], fill_value='-')

print(pv1)

print("Answer D")

pv2=df.pivot_table(index=['ItemName','Itemno'],columns='Color',fill_value="-")

print(pv2)

print("Answer E")

pv3=df.pivot_table(index=['ItemName'],columns='Color',fill_value="-",aggfunc=np.sum)

print(pv3)

print("Answer F")

pv4=pv3=df.pivot_table(index=['ItemName','Itemno'],columns='Color',aggfunc=np.sum,margins=True)

print(pv4)

print("Answer G")

pv4df.pivot_table(index=['ItemName'],columns='Color',values='Price',fill_value="-", aggfunc=np.sum,margins=True)

print(pv4)

3. A DataFrame contains information Customer.csv:

	Customer	Region	Order_Date	Sales	Month	Year
0	Lamba book Stall	East	2020-04-19	1256000	April	2020
1	Muskan book Stall	South	2019-05-22	1359000	May	2019
2	Sarswati book Stall	North	2020-06-11	1670000	August	2020
3	TM books	West	2020-06-02	1820000	October	2020
4	Narayana Pustak	North	2020-10-11	2560000	August	2020
5	BOOKWELL distributers	West	2018-05-11	1568000	September	2018
6	Jain book Agency	South	2017-06-20	1233000	May	2017
7	New India Agency	North	2016-10-29	1730000	November	2016
8	Mahabharat Book Stall	West	2020-06-02	2560000	January	2020
9	Aviral Bustak Bhandar	South	2015-10-11	1568000	October	2015
	Libra book Seller	East	2018-05-11	1233000	December	2018

SOLUTION

import pandas as pd

import numpy as np

df=pd.read_csv('d:/customer.csv')

print(df)

print("Answer a")

pv=df.pivot_table(index='Region',columns='Year',values='Sales',aggfunc='mean')

print(pv)

print("Answer b")

pv1=df.pivot_table(index='Year',columns='Region',aggfunc='mean')

print(pv1)

print("Answer c")

pv2=df.pivot_table(index='Year',columns='Region',values='Sales',aggfunc='sum')

print(pv2)

print("Answer d")

pv3=df.pivot_table(index='Year',columns='Region',values='Sales',aggfunc=[np.min,np.max])

print(pv3)

print("Answer e")

pv4=df.pivot_table(index='Region',aggfunc='count')

print(pv4)

print("Answer f")

Pv5=df.pivot_table(values='Sales',columns='Region',index='Year')

print(Pv5)

Pv6=df.pivot_table(index='Region',columns='Year',aggfunc=np.mean)

print(Pv6.stack())

4. A data set is given the sales of two products in four different regions.

Region	Year	Product	Unit Sold
Southeast	2018	Air Purifier	87
Northwest	2019	Air Conditioner	165
Southwest	2019	Air Purifier	122
Northeast	2019	Air Conditioner	132
Southeast	2018	Air Conditioner	98
Northeast	2019	Air Purifier	120
Northwest	2018	Air Purifier	137
Southeast	2019	Air Conditioner	83
Northwest	2018	Air Purifier	128
Northwest	2019	Air Conditioner	149
Southwest	2018	Air Purifier	167
Northeast	2018	Air Conditioner	139

Using the above data set answer the following :

a) Create table pivotable to summarize the data into region and product wise total sales.

Solution :

import numpy as np

import pandas as pd

df=pd.read_csv('d:/regions.csv')

print(df)

print()

pv=df.pivot_table(index=['Region','Product'],values='Unit Sold',aggfunc=np.sum)

print(pv)

b) Print the summary report.

5. A data set is given with 11 employee in different cities as given below (EJob.csv) :

Name	Position	City	Age	Sex
Aakash	Manager	Delhi	35	M
Sunakshi	Programmer	Mumbai	37	F
Amitabh	Manager	Kanpur	33	M
Madhuri	Programmer	Mumbai	40	F
Ashish	Manager	Kanpur	27	M
Akshay	Programmer	Kanpur	34	M
Preeti	Programmer	Delhi	26	F
Govinda	Manager	Delhi	30	M
Rani	Manager	Mumbai	28	F
Shilpa	Manager	Kanpur	26	F
Ayushman	Programmer	Delhi	28	M

Using the above data set table answer the following :

a) Create a pivot table to print the positions of average of age of each city.

b) Create a pivot table to print the positions of average of age of city and sex.

c) Create a pivot table to print the average of age for each position and sex.

d) What will be output of the following:

SOLUTION

import numpy as np

import pandas as pd

df=pd.read_csv('d:/narainpython/df/Ejob.csv')

print(df)

print()

print("Answer A :")

pv=df.pivot_table(index=['Position'],columns='City',values='Age',aggfunc=np.mean)

print(pv)

print("Answer B :")

pv=df.pivot_table(index=['Position'],columns=['City','Sex'],values='Age',aggfunc=np.mean)

print(pv)

print("Answer C :")

pv=df.pivot_table(index=['Position','Sex'],columns='City',values='Age',aggfunc=np.mean)

print(pv)

print()

print("Output D :")

print(df.pivot_table(index=['Position'],aggfunc={'Age':np.mean}))

6: A DataFrame contains following data (Employee.csv):

Name	Qualification	Experience
Manorama	Masters	8
Rashmi	Graduate	11
Parul	Post Graduate	7
Satyam	Masters	9
Deepak	Graduate	6
Arif	Post Graduate	8
Adil	Masters	10

Write the command for the following (assume that the Dataframe name is dfT)

(a) Find the average experience for each qualification.

(b) Find the total experience for each qualification.

(d) What will be the following display:

Pva=pd.pivot_table(dfT,indext= “Qualification”, columns= “Qualifiaction”, values= ‘Experience’, aggfunc=np.sum)

Pva.stack()

7. A sample dataset is given with different columns as given below(Product.csv):

Product_ID	ProductName	Manufacturer	Price	CustomerName	City
PC01	Personal Computer	HCL India	42000	Anchal	Delhi
LC05	Laptop	HP USA	55000	Rupal	Mumbai
PC03	Personal Computer	Dell USA	32000	Parul	Delhi
PC06	Personal Computer	Zenith USA	37000	Anirudh	Chennai
LC03	Laptop	Dell USA	57000	Ashish	Bengaluru
AL03	Monitor	HP USA	9800	Saurya	Delhi
CS03	Harddisk	Dell USA	5400	Sachin	Mumbai
PR06	MotherBoard	Intel USA	17500	Divya	Delhi
BA03	UPS	Microtek India	4300	Yashika	Chennai
MC01	Monitor	HCL India	68000	Sakshi	Bengaluru

a) Create a city wise customer table.

b) Create a pivot table for manufacturer wise product names and their price.

c) Arrange the data frame in ascending order of customer name.

d) Arrange the data frame in ascending order of City and Price.

8. A sample dataset is given with four quarter sales data for five employees:

Name of Employee	Sales	Quarter	State
R sahay	125600	1	Delhi
Sachin	235600	1	Tamil Nadu
Jyotsna	213400	1	Kerala
Amit	189000	1	Haryana
Swati	456000	1	West Bengal
Ankita	436500	2	Haryana
R sahay	214300	2	Delhi
Sachin	325400	2	Tamil Nadu
Jyotsna	121200	2	Kerala
Amit	214300	2	Haryana
Swati	214600	2	West Bengal
Ankita	326890	2	Haryana
R sahay	256000	3	Delhi
Sachin	250600	3	Tamil Nadu
Jyotsna	203400	3	Kerala
Amit	197000	3	Haryana
Swati	406000	3	West Bengal
Ankita	436500	4	Haryana
R sahay	256000	4	Delhi
Sachin	250600	4	Tamil Nadu
Jyotsna	203400	4	Kerala

Write the command for the following : (Employees.csv)

(a) Find the total sales of each employee.

(b) Find the total sales by state.

(d) Find the maximum individual sale by state.

(e) Find the mean, meadin and minimum sales by state.

Python Tutorial

Monday, 2 November 2020

Pandas DataFrame-Pivot and Pivot Table exercises

No comments:

Post a Comment