Insert Columns into a Pandas DataFrame

Let's find out how to add a column to a pandas DataFrame.

Try it yourself

Try executing the code below to see the result.

import pandas as pd

df = pd.DataFrame([
    ['Sterling', 83.4],
    ['Cheryl', 97.2],
    ['Lana', 13.2],
], columns=['name', 'sum'])
df.late_fee = 3.5
print(df)

Explanation

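Where did the late_fee column go? It was never created. Python’s objects are very dynamic, and we can add attributes to most of them as we please, so df.late_fee = 3.5 simply set a plain attribute on the DataFrame object instead of adding a column. The same thing happens with an ordinary class: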

In [1]: class Point:
   ...:     def __init__(self, x, y):
   ...:         self.x, self.y = x, y
In [2]: p = Point(1, 2)
In [3]: p.x, p.y
Out[3]: (1, 2)
In [4]: p.z = 3
In [5]: p.z
Out[5]: 3

Pandas lets us access columns both by square brackets (for example, df['name']) and by attributes (df.name). However, square brackets are a much better choice than attributes in all scenarios.

One reason to use square brackets is that when we add an attribute to a DataFrame, it doesn’t register as a new column. Another reason is that column names in CSV, JSON, and other formats can contain spaces or other characters that are not valid in Python identifiers, so we can’t reach those columns through attribute access: df.product id is a syntax error, while df['product id'] works. The last reason is that attribute access is confusing, because it can return something other than the column, as the df.sum lookup below shows.
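The first two reasons can be checked with a minimal sketch. It reuses the DataFrame from the exercise; the 'product id' column and its values are hypothetical and only there for illustration.

```python
import pandas as pd

df = pd.DataFrame([
    ['Sterling', 83.4],
    ['Cheryl', 97.2],
    ['Lana', 13.2],
], columns=['name', 'sum'])

# Attribute assignment sets a plain Python attribute, not a column.
df.late_fee = 3.5
print(list(df.columns))  # ['name', 'sum']

# Square brackets accept any column name, including one with a space.
# (Hypothetical column and values, for illustration only.)
df['product id'] = [101, 102, 103]
print(list(df.columns))           # ['name', 'sum', 'product id']
print(df['product id'].tolist())  # [101, 102, 103]
```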

In [6]: df.sum
Out[6]:
<bound method DataFrame.sum of        name   sum
0  Sterling  83.4
1    Cheryl  97.2
2      Lana  13.2>

In the snippet above, we get the DataFrame sum method and not the sum column. Also, most people probably expect late_fee to be a Series like the other columns. However, as we can see in the snippet below, it’s just the float we assigned:

In [7]: df.late_fee
Out[7]: 3.5
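For contrast, here is a minimal sketch of the square-bracket version, using the same data as the exercise. Assigning a scalar through square brackets broadcasts it to every row, and reading the column back returns a Series.

```python
import pandas as pd

df = pd.DataFrame([
    ['Sterling', 83.4],
    ['Cheryl', 97.2],
    ['Lana', 13.2],
], columns=['name', 'sum'])

# Square-bracket assignment creates a real column; the scalar is
# broadcast to every row.
df['late_fee'] = 3.5

print(type(df['late_fee']).__name__)  # Series
print(df['late_fee'].tolist())        # [3.5, 3.5, 3.5]

# Square brackets also reach the 'sum' column, while the attribute
# df.sum is still the DataFrame.sum method.
print(df['sum'].tolist())  # [83.4, 97.2, 13.2]
print(callable(df.sum))    # True
```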

Sometimes we would like to add metadata to a DataFrame, like the name of the file the data was read from. Instead of adding a new attribute (for example, df.originating_file = '/path/to/sales.db'), we can use attrs, an experimental pandas attribute that holds a dictionary for storing metadata in a DataFrame.

In [8]: df.attrs['originating_file'] = '/path/to/sales.db'
In [9]: df.attrs
Out[9]: {'originating_file': '/path/to/sales.db'}
