Implicit Line Joining

Let’s learn how to perform different operations with pandas DataFrame values.

Try it yourself

Try executing the code below to see the result.

import pandas as pd
df = pd.DataFrame([
    ['133.43.96.45', pd.Timedelta('3s')],
    ['133.68.18.180', pd.Timedelta('2s')],
    ['133.43.96.45', pd.NaT],
    ['133.43.96.45', pd.Timedelta('4s')],
    ['133.43.96.45', pd.Timedelta('2s')],
], columns=['ip', 'duration'])
by_ip = (
    df['duration']
    .fillna(pd.Timedelta(seconds=1))
    .groupby(df['ip'])
    .sum()
)
print(by_ip)

Explanation

The surprising fact here is that the teaser contains valid Python code. Python’s use of whitespace is unusual among programming languages, and some programmers don’t like it. However, the whitespace does make the code more readable.

The Python documentation has this to say:

“A logical line is constructed from one or more physical lines by following the explicit or implicit line joining rules.”

And a bit later, the same documentation goes on to say the following:

“Expressions in parentheses, square brackets, or curly braces can be split over more than one physical line without using backslashes.”

What we can infer from the documentation is listed below:

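  • Inside parentheses, square brackets, or curly braces, a newline does not end the logical line.
  • The pd.DataFrame call in the teaser spans several physical lines but is a single logical line.
  • The parenthesized expression assigned to by_ip is also a single logical line, so each method call can sit on its own physical line.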
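The line joining rules quoted above can be seen in a small sketch. The names total_implicit, total_explicit, and squares are made up for this illustration and are not part of the teaser.

```python
# Implicit joining: the open parenthesis keeps the logical line going
# until the matching close parenthesis.
total_implicit = (
    1
    + 2
    + 3
)

# Explicit joining: a backslash ends each physical line that continues.
total_explicit = 1 \
    + 2 \
    + 3

# Square brackets and curly braces behave like parentheses.
squares = {
    n: n * n
    for n in [
        1,
        2,
        3,
    ]
}

print(total_implicit, total_explicit, squares)
```

Both sums are the same expression. The parenthesized form is usually easier to edit because no line depends on a trailing backslash.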

We can use this implicit line joining to make our code clearer. For complex operations, we can wrap a method chain in parentheses and put each call on its own line, which is what was done in the teaser.

The pandas.DataFrame also has a pipe method, which passes the DataFrame to a function of our choosing, so our own functions can take part in a chain.
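Here is a minimal sketch of pipe inside a parenthesized chain. The helper add_seconds and the small sample frame are assumptions made for this illustration, modeled on the teaser's data.

```python
import pandas as pd


def add_seconds(frame, column):
    # Hypothetical helper: returns a copy of the frame with an extra
    # '<column>_s' column holding the durations as float seconds.
    return frame.assign(**{f'{column}_s': frame[column].dt.total_seconds()})


# Sample data assumed for the sketch.
df = pd.DataFrame({
    'ip': ['133.43.96.45', '133.68.18.180', '133.43.96.45'],
    'duration': pd.to_timedelta(['3s', '2s', '4s']),
})

total_seconds = (
    df
    .pipe(add_seconds, 'duration')  # same as add_seconds(df, 'duration')
    .groupby('ip')['duration_s']
    .sum()
)
print(total_seconds)
```

Because pipe returns whatever the function returns, the chain continues with the next method call as if add_seconds were a DataFrame method.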

When constructing lists or tuples across multiple lines, we should add a dangling comma (also called a trailing comma or final comma) after the last entry.

colors = [
    'red',
    'green',
    'blue',  # ← A dangling comma
]

Not only will the dangling comma save us from bugs when a new entry is added on the next line, but a code review will also show only one changed line if we add another color. Sadly, not every language or format allows dangling commas (JSON, for example, does not).
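The kind of bug a dangling comma prevents can be sketched as follows, assuming someone later appends 'yellow' to a list whose last entry had no comma. Python joins adjacent string literals, so the two entries silently merge into one.

```python
colors = [
    'red',
    'green',
    'blue'      # no dangling comma here
    'yellow',   # entry added later
]
print(colors)       # ['red', 'green', 'blueyellow']
print(len(colors))  # 3
```

No error is raised, which is why this mistake is easy to miss in review.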

colors = [
    'red',
    'green',
    'blue',  # ← A dangling comma
]
print(colors)
colors_2 = [
    'red',
    'green',
    'blue'
]
print(colors_2)
