Managing Missing Data
Learn the common techniques for managing missing data effectively in pandas.
Introduction
Managing missing data effectively is a fundamental aspect of data preprocessing in data science. We learned earlier that NaN values tend to be propagated in pandas objects during calculations. While this propagation feature can be desirable, we often need to manipulate these NaN values to achieve accurate and meaningful analysis.
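As a quick reminder of what propagation looks like, here is a minimal sketch. The Series s and its values are made up for illustration and are not part of the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to illustrate NaN propagation
s = pd.Series([10.0, np.nan, 30.0])

# Element-wise arithmetic propagates NaN: the missing entry stays missing
doubled = s * 2

# Reductions such as sum() skip NaN by default...
total_skip = s.sum()

# ...unless we ask pandas to propagate it with skipna=False
total_strict = s.sum(skipna=False)

print(doubled)
print(total_skip, total_strict)
```

Note that element-wise operations and reductions behave differently here: the former keep the NaN in place, while the latter ignore it unless told otherwise.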
In this lesson, we’ll look at four techniques for managing and remedying missing data:
Filling
Replacing
Interpolating
Dropping
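To give a rough preview of the four techniques, the sketch below applies one common pandas method for each to a tiny, made-up DataFrame. The column names (order_id, amount, quantity) and the use of -1 as a placeholder for "unknown" are assumptions for illustration only; they are not taken from the lesson's transaction dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'amount' has a true NaN, 'quantity' uses -1 as a placeholder
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "amount": [20.0, np.nan, 40.0, 50.0],
    "quantity": [2.0, -1.0, 1.0, 3.0],
})

# Filling: substitute a fixed value for each NaN
filled = df["amount"].fillna(0.0)

# Replacing: turn the placeholder value into a proper NaN
replaced = df["quantity"].replace(-1.0, np.nan)

# Interpolating: estimate the NaN from its neighbors (linear by default)
interpolated = df["amount"].interpolate()

# Dropping: remove rows where 'amount' is missing
dropped = df.dropna(subset=["amount"])

print(filled.tolist())
print(replaced.tolist())
print(interpolated.tolist())
print(dropped)
```

Which technique fits best depends on why the values are missing and how the column will be used afterward, so treat this as a starting point rather than a recipe.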
For this lesson, we’ll be using a mock transaction dataset of an e-commerce business, as shown below: