Multiplying Values of Pandas Series

Let's find out how a pandas Series behaves with the all method and the equality operator.

Try it yourself

Try executing the code below to see the result.

import pandas as pd
v = pd.Series([.1, 1., 1.1])
out = v * v
expected = pd.Series([.01, 1., 1.21])
if (out == expected).all():
    print('Math rocks!')
else:
    print('Please reinstall universe & reboot.')

Explanation

The out == expected expression compares the two series element by element and returns a Boolean pandas.Series. The all method returns True only if every element is True.

When we look at out and expected, they seem the same.

In [1]: out
Out[1]:
0    0.01
1    1.00
2    1.21
dtype: float64

In [2]: expected
Out[2]:
0    0.01
1    1.00
2    1.21
dtype: float64

But, when we compare them, we see something strange.

In [2]: out == expected
Out[2]:
0    False
1     True
2    False
dtype: bool

Only the middle value, 1.00, is equal in out and expected. Looking deeper, we can see the problem.

In [3]: print(out[2])
1.2100000000000002

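There’s a difference between how pandas shows the value and how print does. By default, pandas rounds the values it displays to six decimal places, while print shows enough digits to identify the float exactly.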
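To see the stored values rather than the rounded display, we can ask for the full representation of each element. The snippet below is a minimal sketch; the names full_out and full_expected are introduced here for illustration and are not part of the original lesson.

```python
import pandas as pd

v = pd.Series([0.1, 1.0, 1.1])
out = v * v
expected = pd.Series([0.01, 1.0, 1.21])

# repr() of a float gives the shortest string that round-trips to the same value
full_out = [repr(float(x)) for x in out]
full_expected = [repr(float(x)) for x in expected]
print(full_out)
print(full_expected)

# Alternatively, temporarily widen the precision pandas uses for display
with pd.option_context("display.precision", 17):
    print(out)
```

With the full digits visible, the first and last elements of out no longer match expected, which agrees with the result of out == expected.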

💡 String representation

Always remember that the string representation of an object is not the object itself. The Treachery of Images painting illustrates this concept beautifully.

A digital rendering of The Treachery of Images

Upon seeing such issues, some new developers come to the message boards and say, “We found a bug in pandas!” The usual answer given by programming veterans is, “Read the manual.”

What to do about floating-point issues?

As Grant Edwards once said, “Floating point is sort of like quantum physics: the closer you look, the messier it gets.”

The basic idea behind this issue is that floating-point numbers sacrifice accuracy for speed. Trade-offs like this are common in computer science.

The result we see conforms to the floating-point specification (IEEE 754). The same calculation gives the same output in Go, Rust, C, Java, and so on.

The main point to remember is that floating-point numbers are not exact, and the gap between neighboring representable values grows as the magnitude of the number increases.
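The loss of accuracy at larger magnitudes can be observed directly. The sketch below uses numpy.spacing, which returns the distance from a value to the next representable float; the variable names are illustrative only.

```python
import numpy as np

# Distance to the next representable float64 near 1.0 and near 1e16
gap_small = np.spacing(1.0)
gap_large = np.spacing(1e16)
print(gap_small)
print(gap_large)

# Near 1e16 the gap is larger than 1, so adding 1.0 changes nothing
unchanged = (1e16 + 1.0 == 1e16)
print(unchanged)
```

Near 1.0 the gap is tiny, but near 1e16 neighboring floats are a whole 2.0 apart, so an increment of 1.0 is lost.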

Floating-point issues arise quite often, so we’ll probably need to compare a pandas.Series or pandas.DataFrame at some point. Keep in mind that floating-point results will rarely be exactly equal. Instead, we can choose an acceptable threshold and use the numpy.allclose function.

In [4]: import numpy as np
In [5]: np.allclose(out, expected)
Out[5]: True

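The numpy.allclose function has options we can tweak: rtol (relative tolerance, default 1e-05), atol (absolute tolerance, default 1e-08), and equal_nan (whether NaN values compare as equal, default False).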
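As a rough illustration of how the tolerances affect the result, the sketch below compares the same two series with the default tolerances and with both tolerances set to zero. numpy.allclose treats two values a and b as close when |a - b| <= atol + rtol * |b|. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

v = pd.Series([0.1, 1.0, 1.1])
out = v * v
expected = pd.Series([0.01, 1.0, 1.21])

# Default tolerances: rtol=1e-05, atol=1e-08
default_ok = np.allclose(out, expected)

# Zero tolerance demands exact equality, so the rounding error shows up again
strict_ok = np.allclose(out, expected, rtol=0, atol=0)

# np.isclose gives the element-wise result instead of a single Boolean
elementwise = np.isclose(out.to_numpy(), expected.to_numpy())

print(default_ok, strict_ok, elementwise)
```

Which tolerance is appropriate depends on the data; the defaults are a starting point, not a universal answer.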

Solution

import numpy as np
import pandas as pd
v = pd.Series([.1, 1., 1.1])
out = v * v
expected = pd.Series([.01, 1., 1.21])
if np.allclose(out, expected):
    print('Math rocks!')
else:
    print('Please reinstall universe & reboot.')

If we need better accuracy, we can look into the decimal module, which provides correctly rounded decimal floating-point arithmetic.
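As a small sketch of what the decimal module offers, the same calculation can be repeated with Decimal values built from strings. This uses plain Python lists rather than a pandas.Series, which is a simplification made for this example.

```python
from decimal import Decimal

# Build the values from strings so they are exact decimal numbers
v = [Decimal("0.1"), Decimal("1"), Decimal("1.1")]
out = [x * x for x in v]
expected = [Decimal("0.01"), Decimal("1"), Decimal("1.21")]

exact_match = (out == expected)
print(exact_match)

# Building a Decimal from a float carries the binary rounding error along
from_float_differs = (Decimal(0.1) != Decimal("0.1"))
print(from_float_differs)
```

The trade-off is speed: Decimal arithmetic is much slower than native floats, so it is usually reserved for cases such as monetary amounts where exact decimal results matter.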
