Plotting Data using Matplotlib
The Matplotlib library is used for creating static, animated and interactive 2D plots or figures in Python. It can be installed using the following pip command from the command prompt:
pip install matplotlib
For plotting using Matplotlib, we need to import its pyplot module using the following command:
import matplotlib.pyplot as plt
Here, plt is an alias, or alternative name, for matplotlib.pyplot. Any other alias can also be used.
The pyplot module of matplotlib contains a collection of functions that can be used to work on a plot. The plot() function of the pyplot module is used to create a figure. A figure is the overall window where the outputs of pyplot functions are plotted. A figure contains a plotting area, legend, axis labels, ticks, title, etc. Each function makes some change to a figure: for example, it creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels.
Data presented through charts is expected to be easily understood. Hence, while presenting data we should always give the chart a title, label its axes and provide a legend when more than one data series is plotted.
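As a minimal sketch of this advice, the following plots two series on one chart and adds a title, axis labels and a legend. The day names and both temperature lists are made-up values for illustration, and the Agg backend line is an assumption that lets the script run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window with plt.show()
import matplotlib.pyplot as plt

# hypothetical data, for illustration only
days = ["Mon", "Tue", "Wed"]
city_a = [8.5, 10.5, 6.8]
city_b = [12.0, 11.5, 9.0]

plt.figure()
plt.plot(days, city_a, label="City A")   # label is the text shown in the legend
plt.plot(days, city_b, label="City B")
plt.title("Day wise Temperature")
plt.xlabel("Day")
plt.ylabel("Temperature")
legend = plt.legend()                    # needed because two series share the chart

# collect the legend entries so they can be inspected
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Without the label arguments, legend() would have nothing to show, so a label is given to each plot() call.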
To plot x versus y, we can write plt.plot(x,y). The show() function is used to display the figure created using the plot() function. Let us consider that in a city, the maximum temperature of a day is recorded for three consecutive days. The program below demonstrates how to plot temperature values for the given dates. The output generated is a line chart.
Program - Plotting Temperature against Date
import matplotlib.pyplot as plt
#list storing date in string format
date=["25/12","26/12","27/12"]
#list storing temperature values
temp=[8.5,10.5,6.8]
#create a figure plotting temp versus date
plt.plot(date, temp)
#show the figure
plt.show()
List of Pyplot functions to plot different charts
Function | Description
plot(*args[, scalex, scaley, data]) | Plot x versus y as lines and/or markers.
bar(x, height[, width, bottom, align, data]) | Make a bar plot.
boxplot(x[, notch, sym, vert, whis, ...]) | Make a box and whisker plot.
hist(x[, bins, range, density, weights, ...]) | Plot a histogram.
pie(x[, explode, labels, colors, autopct, ...]) | Plot a pie chart.
scatter(x, y[, s, c, marker, cmap, norm, ...]) | A scatter plot of x versus y.
List of Pyplot functions to customise plots
Function | Description
grid([b, which, axis]) | Configure the grid lines.
legend(*args, **kwargs) | Place a legend on the axes.
savefig(*args, **kwargs) | Save the current figure.
show(*args, **kw) | Display all figures.
title(label[, fontdict, loc, pad]) | Set a title for the axes.
xlabel(xlabel[, fontdict, labelpad]) | Set the label for the x-axis.
xticks([ticks, labels]) | Get or set the current tick locations and labels of the x-axis.
ylabel(ylabel[, fontdict, labelpad]) | Set the label for the y-axis.
yticks([ticks, labels]) | Get or set the current tick locations and labels of the y-axis.
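A few of these functions can be tried together in a short sketch. It assumes numeric x positions 0, 1, 2 that are relabelled with dates through xticks(), and it saves the figure into an in-memory buffer rather than a file; both choices, and the Agg backend line, are assumptions made so the script runs anywhere.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

temp = [8.5, 10.5, 6.8]
positions = [0, 1, 2]  # numeric x positions, relabelled below

plt.figure()
plt.plot(positions, temp)
plt.xticks(positions, ["25/12", "26/12", "27/12"])  # tick locations first, then their labels
plt.grid(True)                                      # gridlines in the background
plt.title("Date wise Temperature")

# savefig() accepts a file name or a file-like object
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
png_bytes = buffer.getvalue()

tick_labels = [tick.get_text() for tick in plt.gca().get_xticklabels()]
```

Passing a name such as "temperature.png" to savefig() instead of the buffer would write the chart to disk.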
Program 4-2 Plotting a line chart of date versus temperature with labels on the x and y axes, a title and gridlines.
Answer :
import matplotlib.pyplot as plt
date=["25/12","26/12","27/12"]
temp=[8.5,10.5,6.8]
plt.plot(date, temp)
plt.xlabel("Date") #add the Label on x-axis
plt.ylabel("Temperature") #add the Label on y-axis
plt.title("Date wise Temperature") #add the title to the chart
plt.grid(True) #add gridlines to the background
plt.yticks(temp)
plt.show()
Marker
We can make certain other changes to plots by passing various parameters to the plot() function. In Figure, we plot temperatures day-wise. It is also possible to specify each point in the line through a marker.
A marker is any symbol that represents a data value in a line chart or a scatter plot. Table 4.3 shows a list of markers along with their corresponding symbol and description. These markers can be used in program codes:
Colour abbreviations for plotting
Character | Colour
'b' | blue
'g' | green
'r' | red
'c' | cyan
'm' | magenta
'y' | yellow
'k' | black
'w' | white
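The colour characters can be combined with a marker and a line style in a single format string passed to plot(). The sketch below draws the temperature data twice, once with the short format string and once with the equivalent keyword arguments; the Agg backend line is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

date = ["25/12", "26/12", "27/12"]
temp = [8.5, 10.5, 6.8]

plt.figure()
# 'r' = red, '*' = star marker, '--' = dashed line
(short_form,) = plt.plot(date, temp, "r*--")
# the same styling spelled out with keyword arguments
(long_form,) = plt.plot(date, temp, color="r", marker="*", linestyle="--")
```

The keyword form is longer but easier to read when several properties, such as markersize or linewidth, are set together.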
Program - Consider the average heights and weights of persons aged 8 to 16 stored in the following two lists:
height = [121.9, 124.5, 129.5, 134.6, 139.7, 147.3, 152.4, 157.5, 162.6]
weight = [19.7, 21.3, 23.5, 25.9, 28.5, 32.1, 35.7, 39.6, 43.2]
Let us plot a line chart where:
i. x axis will represent weight
ii. y axis will represent height
iii. x axis label should be “Weight in kg”
iv. y axis label should be “Height in cm”
v. colour of the line should be green
vi. use * as marker
vii. marker size should be 10
viii. the title of the chart should be “Average weight with respect to average height”
ix. line style should be dashed
x. line width should be 2
Answer
import matplotlib.pyplot as plt
import pandas as pd
height=[121.9,124.5,129.5,134.6,139.7,147.3,152.4,157.5,162.6]
weight=[19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6,43.2]
df=pd.DataFrame({"height":height,"weight":weight})
#Set xlabel for the plot
plt.xlabel('Weight in kg')
#Set ylabel for the plot
plt.ylabel('Height in cm')
#Set chart title
plt.title('Average weight with respect to average height')
#plot using marker '*', a dashed line of width 2 and line colour green
plt.plot(df.weight,df.height,marker='*',markersize=10,color='green',linewidth=2,linestyle='dashed')
plt.show()
The Pandas Plot function (Pandas Visualisation)
We learnt that the plot() function of the pyplot module of matplotlib can be used to plot a chart. However, starting from version 0.17.0, Pandas objects Series and DataFrame come equipped with their own .plot() methods. This plot() method is just a simple wrapper around the plot() function of pyplot. Thus, if we have a Series or DataFrame type object (let's say 's' or 'df') we can call the plot method by writing: s.plot() or df.plot().
The plot() method of Pandas accepts a considerable number of arguments that can be used to plot a variety of graphs. It allows customising different plot types by supplying the kind keyword argument. The general syntax is: df.plot(kind='...'), where kind accepts a string indicating the type of plot, as listed in the table below. In addition, we can use the matplotlib.pyplot functions along with the plot() method of Pandas objects.
Arguments accepted by kind for different plots
kind = | Plot Type
line | Line plot (default)
bar | Vertical bar plot
barh | Horizontal bar plot
hist | Histogram
box | Boxplot
area | Area plot
pie | Pie plot
scatter | Scatter plot
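A small sketch of the kind argument follows. The DataFrame and its column names A and B are made up for illustration, and the Agg backend line is an assumption so that the script runs without a display. It also shows a pyplot function being applied after the Pandas plot() call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical data: two columns, three rows
df = pd.DataFrame({"A": [1, 3, 2], "B": [2, 1, 4]})

line_axes = df.plot(kind="line")   # default kind: one line per column
bar_axes = df.plot(kind="barh")    # horizontal bars: one bar per value

# pyplot functions act on the most recently created chart (the barh one)
plt.title("Horizontal bars")
plt.xlabel("Value")
```

Each df.plot() call here opens its own figure and returns the axes it drew on, which is why the title lands on the bar chart only.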
Plotting a Line chart
A line plot is a graph that shows the frequency of data along a number line. It is used to show a continuous dataset, and to visualise growth or decline in data over a time interval. We have already plotted line charts in the earlier programs. In this section, we will learn to plot a line chart for data stored in a DataFrame.
Program - Smile NGO has participated in a three-week cultural mela. Using Pandas, they have stored the sales (in Rs) made day-wise for every week in a CSV file named “MelaSales.csv”, as shown in the table below.
Day-wise mela sales data
Week 1 | Week 2 | Week 3
5000 | 4000 | 4000
5900 | 3000 | 5800
6500 | 5000 | 3500
3500 | 5500 | 2500
4000 | 3000 | 3000
5300 | 300 | 5300
7900 | 5900 | 6000
Depict the sales for the three weeks using a Line chart. It should have the following:
i. Chart title as “Mela Sales Report”.
ii. x axis label as “Days”.
iii. y axis label as “Sales in Rs”.
iv. Line colours red for week 1, blue for week 2 and brown for week 3.
import pandas as pd
import matplotlib.pyplot as plt
# reads "MelaSales.csv" to df by giving path to the file
df=pd.read_csv("MelaSales.csv")
#create a line plot of different color for each week
df.plot(kind='line', color=['red','blue','brown'])
# Set title to "Mela Sales Report"
plt.title('Mela Sales Report')
# Label x axis as "Days"
plt.xlabel('Days')
# Label y axis as "Sales in Rs"
plt.ylabel('Sales in Rs')
#Display the figure
plt.show()
The same chart can be customised further with marker="*", marker size=10, linestyle="--" and linewidth=3:
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("MelaSales.csv")
#creates plot of different color for each week
df.plot(kind='line', color=['red','blue','brown'],marker="*",markersize=10,linewidth=3,linestyle="--")
plt.title('Mela Sales Report')
plt.xlabel('Days')
plt.ylabel('Sales in Rs')
#store converted index of DataFrame to a list
ticks = df.index.tolist()
#displays corresponding day on x axis (uses the Day column of the CSV)
plt.xticks(ticks,df.Day)
plt.show()
Plotting Bar Chart
The line plot in Figure shows that the sales for all the weeks increased during the weekend. Other than weekends, it also shows that the sales increased on Wednesday for Week 1, on Thursday for Week 2 and on Tuesday for Week 3. However, the lines do not make it easy to compare the weeks against each other. To show comparisons, we prefer bar charts, which also suit categorical (string) values such as day names on the x axis. To plot a bar chart, we specify kind='bar'. We can also specify the DataFrame columns to be used as the x and y axes.
Day-wise sales data along with Day’s names
Week 1 | Week 2 | Week 3 | Day
5000 | 4000 | 4000 | Monday
5900 | 3000 | 5800 | Tuesday
6500 | 5000 | 3500 | Wednesday
3500 | 5500 | 2500 | Thursday
4000 | 3000 | 3000 | Friday
5300 | 300 | 5300 | Saturday
7900 | 5900 | 6000 | Sunday
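To try the programs that read "MelaSales.csv", the file has to exist first. One possible way to create it from the table above is sketched here; writing into a temporary folder is an assumption made so the sketch does not overwrite anything, and the variable names are illustrative.

```python
import os
import tempfile
import pandas as pd

# the day-wise sales table, one list per column
sales = pd.DataFrame({
    "Week 1": [5000, 5900, 6500, 3500, 4000, 5300, 7900],
    "Week 2": [4000, 3000, 5000, 5500, 3000, 300, 5900],
    "Week 3": [4000, 5800, 3500, 2500, 3000, 5300, 6000],
    "Day": ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"],
})

# write the CSV into a temporary folder (assumption: any writable folder works)
folder = tempfile.mkdtemp()
csv_path = os.path.join(folder, "MelaSales.csv")
sales.to_csv(csv_path, index=False)  # index=False keeps row numbers out of the file

# read it back the same way the programs do
df = pd.read_csv(csv_path)
```

Saving with sales.to_csv("MelaSales.csv", index=False) instead places the file in the current working directory, where pd.read_csv("MelaSales.csv") will find it.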
Program 4-6 Python script to display a bar plot for the “MelaSales.csv” file with the column Day on the x axis.
Answer:
import pandas as pd
import matplotlib.pyplot as plt
df= pd.read_csv('MelaSales.csv')
# plots a bar chart with the column "Day" as x axis and sets the title
df.plot(kind='bar',x='Day',title='Mela Sales Report')
#set ylabel
plt.ylabel('Sales in Rs')
plt.show()
import pandas as pd
import matplotlib.pyplot as plt
df= pd.read_csv('MelaSales.csv')
# plots a bar chart with the column "Days" as x axis
df.plot(kind='bar',x='Day',title='Mela Sales Report',color=['red','yellow','purple'],edgecolor='Green',linewidth=2,linestyle='--')
#set title and set ylabel
plt.ylabel('Sales in Rs')
plt.show()
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name':['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash','Nazar'],'Height' : [60,61,63,65,61,60],'Weight' : [47,89,52,58,50,47]}
df=pd.DataFrame(data)
#plots one histogram per numeric column (Height and Weight)
df.plot(kind='hist')
plt.show()
df.plot(kind='hist',bins=20)
df.plot(kind='hist',bins=[18,19,20,21,22])
df.plot(kind='hist',bins=range(18,25))
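The effect of the bins argument is easiest to see on a single column. The sketch below uses only the Height values from the data above; the bin edges 60, 62, 64 and 66 are an assumption chosen to cover those values, and the Agg backend line is there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

heights = pd.Series([60, 61, 63, 65, 61, 60], name="Height")

# a list of edges gives three bins: [60, 62), [62, 64) and [64, 66]
ax = heights.plot(kind="hist", bins=[60, 62, 64, 66])

# each bar of the histogram is a rectangle whose height is the bin's count
counts = [int(bar.get_height()) for bar in ax.patches]
```

Here counts holds 4, 1 and 1: four heights fall in the first bin and one in each of the others. Passing an integer such as bins=3 instead lets the edges be computed from the data range.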
Customising Histogram
Taking the same data as above, let us now see how the histogram can be customised. Let us change the edgecolor, which is the border of each bar, to green. Also, let us change the line style to ":" and the line width to 2. Let us try another property called fill, which takes Boolean values. The default, True, means each bar is filled with colour, and False means each bar is left empty. Another property called hatch can be used to fill each bar with a pattern ('-', '+', 'x', '\\', '*', 'o', 'O', '.'). In Program 4-10, we have used the hatch value "o".
Program
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name':['Arnav', 'Sheela', 'Azhar','Bincy','Yash','Nazar'],'Height' : [60,61,63,65,61,60],'Weight' : [47,89,52,58,50,47]}
df=pd.DataFrame(data)
df.plot(kind='hist',edgecolor='Green',linewidth=2,linestyle=':',fill=False,hatch='o')
plt.show()