Python NumPy histogram() tutorial

A histogram maps intervals to frequencies and is often used to approximate the probability density function of a variable. It resembles a bar graph, but each bar covers a range of values rather than a single category. Python offers many options for building and plotting histograms. The NumPy library, which is used for scientific and mathematical operations, provides the histogram() function for this purpose. The function computes the frequency distribution of a data set; it does not draw anything itself, so a plotting library such as matplotlib is needed to display the result. In a plotted histogram, the class intervals are represented by bins that look like vertical rectangles, and the height of each rectangle represents the frequency. Knowing how to create a NumPy array is necessary to understand the examples shown in this tutorial.

Syntax:

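numpy.histogram(input_array, bins=10, range=None, normed=None, weights=None, density=None)

This function takes up to six arguments and returns the computed histogram of a set of data. input_array is the data to be binned (NumPy itself names this parameter a). bins is either the number of equal-width bins or a sequence of bin edges. range sets the lower and upper limits of the bins when bins is a number. weights assigns a weight to each input value, so that each value contributes its weight instead of 1. density, when set to True, returns the value of the probability density function for each bin instead of the count. normed is a deprecated predecessor of density and has been removed from recent NumPy releases, so density should be used instead.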
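The bins, range, and weights arguments can be tried out with a short sketch. The sample data and weights below are made up for illustration and do not come from the tutorial's examples.

```python
import numpy as np

# Illustrative data; the value 15 lies outside the range used below
data = np.array([1, 2, 2, 4, 7, 8, 9, 15])

# bins as an integer: 3 equal-width bins over the interval 0 to 9
counts, edges = np.histogram(data, bins=3, range=(0, 9))
print(counts)  # counts per bin: 3, 1, 3 (15 is ignored)
print(edges)   # edges: 0, 3, 6, 9

# weights: each value adds its weight to the bin instead of adding 1
weights = np.array([0.5, 1, 1, 1, 2, 2, 2, 10])
weighted, _ = np.histogram(data, bins=3, range=(0, 9), weights=weights)
print(weighted)  # weighted sums per bin: 2.5, 1.0, 6.0
```

Values that fall outside range are left out of both the counts and the weighted sums.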

This function returns a tuple of two arrays. The first is the hist array, which contains the count (or density) for each bin. The second is the edge array, which contains the bin edges and is therefore one element longer than the hist array.
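As a minimal sketch of the relationship between the two returned arrays, the snippet below pairs each count with the two edges that bound its bin. The input values are chosen only for illustration.

```python
import numpy as np

hist, bin_edges = np.histogram([1, 1, 2, 5, 6], bins=[0, 2, 4, 6])

# Three bins are described by four edges
print(len(hist), len(bin_edges))  # 3 4

# Pair each count with the left and right edge of its bin
for count, left, right in zip(hist, bin_edges[:-1], bin_edges[1:]):
    print(f"{left} to {right}: {count}")
```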

Example 1: Print the histogram array

The following example shows the use of the histogram() function with a one-dimensional input and a bins argument given as a sequence of edges. A list of 5 integers is used as the input, and a list of 5 increasing values is used as the bin edges, which defines 4 bins. The histogram array and the bin array are printed together as a tuple.

# Import NumPy library
import numpy as np
# Call histogram() function that returns histogram data
np_array = np.histogram([10, 3, 8, 9, 7], bins=[2, 4, 6, 8, 10])
# Print the histogram output
print("The output of histogram is : \n", np_array)

Output:

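Executing the above script prints a tuple of two arrays: the histogram array [1, 0, 1, 3] and the bin array [2, 4, 6, 8, 10]. Every bin includes its left edge and excludes its right edge, except the last bin, which includes both edges; this is why the value 10 is counted in the last bin.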
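The counts of Example 1 can be reproduced by hand to see how values are assigned to bins. The loop below is only an illustrative sketch of the binning rule (half-open bins, with the last bin closed on both sides), not NumPy's actual implementation.

```python
import numpy as np

data = np.array([10, 3, 8, 9, 7])
edges = np.array([2, 4, 6, 8, 10])

manual = []
last = len(edges) - 2  # index of the last bin
for i in range(len(edges) - 1):
    left, right = edges[i], edges[i + 1]
    if i == last:
        # The last bin includes its right edge
        in_bin = (data >= left) & (data <= right)
    else:
        # Every other bin excludes its right edge
        in_bin = (data >= left) & (data < right)
    manual.append(int(in_bin.sum()))

hist, _ = np.histogram(data, bins=edges)
print(manual)         # [1, 0, 1, 3]
print(hist.tolist())  # [1, 0, 1, 3]
```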

Example 2: Print the histogram and bin arrays

The following example shows how the histogram array and the bin array can be created by using the histogram() function. A NumPy array of the integers 0 to 89 is created by using the arange() function in the script. Next, the histogram() function is called, and its two return values are stored separately as the histogram array and the bin array.

# Import NumPy library
import numpy as np
# Create NumPy array using arange()
np_array = np.arange(90)
# Create histogram data
hist_array, bin_array = np.histogram(np_array, bins=[0, 10, 25, 45, 70, 100])
# Print histogram array
print("The data of the histogram array is: ", hist_array)
# Print bin array
print("The data of the bin array is: ", bin_array)

Output:

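Executing the above script prints the histogram array, whose values are 10, 15, 20, 25, and 20, followed by the bin array, whose values are 0, 10, 25, 45, 70, and 100. The last bin covers 70 to 100 but holds only the 20 values from 70 to 89.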
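One detail worth checking when the bin edges are given explicitly is what happens to values outside them. The sketch below reuses the data of Example 2 with a shortened, hypothetical edge list to show that such values are silently dropped.

```python
import numpy as np

data = np.arange(90)

# Edges from Example 2: every value falls inside some bin
hist_full, _ = np.histogram(data, bins=[0, 10, 25, 45, 70, 100])
print(hist_full.sum())  # 90

# Shortened edge list (for illustration): values above 70 are ignored
hist_cut, _ = np.histogram(data, bins=[0, 10, 25, 45, 70])
print(hist_cut.tolist())  # [10, 15, 20, 26]
print(hist_cut.sum())     # 71
```

Comparing the sum of the histogram array with the size of the input is a quick way to notice that some values were left out.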

Example 3: Print the histogram and bin arrays based on density argument

The following example shows the use of the density argument of the histogram() function to create the histogram array. A NumPy array of 20 numbers is created by using the arange() function. Because the bins argument is not given, the default of 10 equal-width bins is used. The first histogram() call sets density to False and returns counts. The second histogram() call sets density to True and returns probability density values.

# Import NumPy library
import numpy as np
# Create a NumPy array of 20 sequential numbers
np_array = np.arange(20)
# Calculate the histogram data with false density
hist_array, bin_array = np.histogram(np_array, density=False)
print("The histogram output by setting density to False: \n", hist_array)
print("The output of bin array : \n", bin_array)
# Calculate the histogram data with true density
hist_array, bin_array = np.histogram(np_array, density=True)
print("\nThe histogram output by setting density to True: \n", hist_array)
print("The output of bin array : \n", bin_array)

Output:

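Executing the above script prints two pairs of arrays. With density set to False, each of the 10 bins holds a count of 2. With density set to True, each bin holds about 0.0526, which is the count divided by the product of the total number of values and the bin width: 2 / (20 * 1.9). The bin array is the same in both cases and contains 11 evenly spaced edges from 0 to 19.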
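The relationship between the two results of Example 3 can be verified numerically. The sketch below recomputes the density from the counts and confirms that the density values, multiplied by the bin widths, sum to 1.

```python
import numpy as np

data = np.arange(20)

counts, edges = np.histogram(data)             # density defaults to counts
density, _ = np.histogram(data, density=True)

# Density is the count divided by (total count * bin width)
widths = np.diff(edges)
manual = counts / (counts.sum() * widths)

print(np.allclose(density, manual))  # True
# The area under the histogram is 1 (up to floating-point rounding)
print(np.isclose(np.sum(density * widths), 1.0))  # True
```

Note that the density values themselves do not sum to 1 unless every bin has a width of 1; it is the area that is normalized.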

Example 4: Draw a bar chart using histogram data

The matplotlib library must be installed before executing this example's script, because it is used to draw the bar chart. hist_array and bin_array are created by using the histogram() function and are then passed to the bar() function of matplotlib to create the bar chart.

# import necessary libraries
import matplotlib.pyplot as plt
import numpy as np
# Create histogram dataset
hist_array, bin_array = np.histogram([4, 10, 3, 13, 8, 9, 7], bins=[2, 4, 6, 8, 10, 12, 14])
# Set some configurations for the chart
plt.figure(figsize=[10, 5])
plt.xlim(min(bin_array), max(bin_array))
plt.grid(axis='y', alpha=0.75)
plt.xlabel('Edge Values', fontsize=20)
plt.ylabel('Histogram Values', fontsize=20)
plt.title('Histogram Chart', fontsize=25)
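# Create the chart; each bar spans its bin from the left edge to the right edge
plt.bar(bin_array[:-1], hist_array, width=np.diff(bin_array), align='edge', color='blue', edgecolor='black')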
# Display the chart
plt.show()

Output:

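Executing the above script opens a chart window with six bars, one for each bin between 2 and 14. The bar heights are 1, 1, 1, 2, 1, and 1, which are the values of hist_array.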
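When positioning bars, it can help to derive the bin widths and bin centers from the edge array. The sketch below uses only NumPy and the data of Example 4; the names widths and centers are introduced here for illustration.

```python
import numpy as np

hist_array, bin_array = np.histogram(
    [4, 10, 3, 13, 8, 9, 7], bins=[2, 4, 6, 8, 10, 12, 14]
)

# Width of each bin: difference between consecutive edges
widths = np.diff(bin_array)
# Center of each bin: left edge plus half the width
centers = bin_array[:-1] + widths / 2

for center, width, height in zip(centers, widths, hist_array):
    print(f"center={center}, width={width}, height={height}")
```

These arrays could then be passed to matplotlib as plt.bar(centers, hist_array, width=widths), which draws each bar centered on its bin and as wide as the bin.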

Conclusion:

This tutorial has explained the histogram() function through several simple examples that show what the function is for and how to apply it properly in a script.
