Histograms and Probability Density Function
In this lesson, we will learn about representing data using histograms and probability density functions.
Representing data
One of the most common ways to represent a data set is to draw a histogram. For a histogram, you count how many data points fall within a certain interval, for example, how many data points lie between 5 and 6. These intervals are called bins. The bar graph of the number of data points in each bin is called a histogram. The function to compute and plot a histogram is called hist() and is part of the matplotlib package. The simplest way of plotting a histogram is to let hist() decide what bins to use; the default number of bins is 10. Below, the number of bins is written as nbin (in hist() itself, the keyword that sets it is bins). hist() even figures out where to put the limits of the bins. The hist() function creates a histogram graph and returns a tuple of three items:
- The first item is an array of length nbin with the number of data points in each bin.
- The second item is an array of length nbin+1 with the limits of the bins.
- The third item is a list of objects that represent the bars of the histogram; this is the least used item.
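To make the idea of counting points per bin concrete, here is a minimal sketch that does the count by hand and then with numpy's histogram() function, which computes the same counts and limits without drawing anything. The data values and bin limits are made up purely for illustration.

```python
import numpy as np

# Hypothetical data set, chosen only for illustration
data = np.array([4.2, 5.1, 5.5, 5.9, 6.3, 7.8])

# Count by hand: how many data points lie in the bin from 5 up to 6?
count = np.sum((data >= 5) & (data < 6))
print(count)  # 3

# The same counting for four bins with limits 4, 5, 6, 7, 8 (assumed limits)
counts, edges = np.histogram(data, bins=[4, 5, 6, 7, 8])
print(counts)      # number of data points in each bin
print(len(edges))  # one more limit than there are bins
```

Four bins need five limits, which is why the array of limits is always one element longer than the array of counts.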
Let’s draw a histogram:
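The following is a sketch rather than the lesson's original example: it assumes a made-up data set of 100 normally distributed values (the mean, spread, and seed are arbitrary choices) and lets hist() pick the bins itself. The three returned items are unpacked into names chosen here for readability.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 100 normally distributed values (an assumption for illustration)
rng = np.random.default_rng(0)
data = rng.normal(loc=10, scale=2, size=100)

# Let hist() decide the bins; with default settings it uses 10
counts, edges, bars = plt.hist(data)
plt.xlabel('value')
plt.ylabel('number of data points')

print(len(counts))   # number of bins
print(len(edges))    # number of bin limits: one more than the number of bins
print(counts.sum())  # every data point falls in exactly one bin
```

In a script, call plt.show() afterwards to display the figure; in a notebook, the plot usually appears on its own.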