mikebmassey

Reputation: 8604

How to update a column in Pandas Dataframe

In Pandas, I am trying to add a new column to a data frame (DF2), or update an existing one, with values from another data frame (DF1). I know how I would do this in SQL:

UPDATE DF2
SET DF2['Column'] = DF1['Column']
FROM DF2
JOIN DF1 ON DF1['NonIndexColumn'] = DF2['NonIndexColumn']

Data Example:

import pandas as pd

d = [{'CustomerID': 1, 'SignUpDate': '2014-01-01'}, {'CustomerID': 2, 'SignUpDate': '2014-02-01'}, {'CustomerID': 3, 'SignUpDate': '2014-03-01'}, {'CustomerID': 4, 'SignUpDate': '2014-04-01'}]
DF1 = pd.DataFrame(data=d)

d2 = [{'OrderID': 1, 'CustomerID': 1, 'OrderDate': '2014-01-15'}, {'OrderID': 2, 'CustomerID': 1, 'OrderDate': '2014-01-15'}, {'OrderID': 3, 'CustomerID': 2, 'OrderDate': '2014-03-28'}, {'OrderID': 4, 'CustomerID': 1, 'OrderDate': '2014-03-29'}, {'OrderID': 5, 'CustomerID': 3, 'OrderDate': '2014-04-28'}, {'OrderID': 6, 'CustomerID': 2, 'OrderDate': '2014-06-01'}, {'OrderID': 7, 'CustomerID': 1, 'OrderDate': '2014-11-06'}, {'OrderID': 8, 'CustomerID': 3, 'OrderDate': '2015-01-28'}, {'OrderID': 9, 'CustomerID': 1, 'OrderDate': '2015-02-15'} ]
DF2 = pd.DataFrame(data=d2)

I am trying to add DF1['SignUpDate'] on to DF2, so that DF2 would look like this:

   CustomerID   OrderDate  OrderID  SignUpDate
0           1  2014-01-15        1  2014-01-01
1           1  2014-01-15        2  2014-01-01
2           2  2014-03-28        3  2014-02-01
3           1  2014-03-29        4  2014-01-01
4           3  2014-04-28        5  2014-03-01
5           2  2014-06-01        6  2014-02-01
6           1  2014-11-06        7  2014-01-01
7           3  2015-01-28        8  2014-03-01
8           1  2015-02-15        9  2014-01-01

I know that merge would let me add the column, but I would have to either overwrite an existing DF or create a new one, like this:

DF1 = pd.merge(DF1, DF2) #overwrite
DF3 = pd.merge(DF1, DF2) #new dataframe

Is there not a way to join on one field (maybe an indexed column, maybe not an indexed column) and update / add the field?

Upvotes: 0

Views: 1791

Answers (1)

EdChum

Reputation: 394459

Perform a left merge on CustomerID, which keeps every row of DF2:

In [4]: DF2.merge(DF1, on='CustomerID', how='left')
Out[4]:
   CustomerID   OrderDate  OrderID  SignUpDate
0           1  2014-01-15        1  2014-01-01
1           1  2014-01-15        2  2014-01-01
2           2  2014-03-28        3  2014-02-01
3           1  2014-03-29        4  2014-01-01
4           3  2014-04-28        5  2014-03-01
5           2  2014-06-01        6  2014-02-01
6           1  2014-11-06        7  2014-01-01
7           3  2015-01-28        8  2014-03-01
8           1  2015-02-15        9  2014-01-01
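
Note that merge returns a new DataFrame, so the call above leaves DF2 itself unchanged. The sketch below shows two ways one might keep the result: binding the merged frame to a name, or adding the column to DF2 directly through a lookup Series, which is closer in spirit to the SQL UPDATE ... JOIN in the question. It uses a shortened copy of the question's data; the names merged and signup_by_customer are illustrative, and the second option assumes CustomerID is unique in DF1.

```python
import pandas as pd

# Shortened copy of the question's data (first five orders only).
DF1 = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "SignUpDate": ["2014-01-01", "2014-02-01", "2014-03-01", "2014-04-01"],
})
DF2 = pd.DataFrame({
    "OrderID": [1, 2, 3, 4, 5],
    "CustomerID": [1, 1, 2, 1, 3],
    "OrderDate": ["2014-01-15", "2014-01-15", "2014-03-28",
                  "2014-03-29", "2014-04-28"],
})

# Option 1: left merge into a new frame. validate="m:1" raises if
# CustomerID is duplicated in DF1, which would otherwise multiply rows.
merged = DF2.merge(DF1, on="CustomerID", how="left", validate="m:1")

# Option 2: add (or overwrite) the column on DF2 itself.
# Build a Series indexed by CustomerID, then look each order's
# CustomerID up in it. Assumes CustomerID is unique in DF1.
signup_by_customer = DF1.set_index("CustomerID")["SignUpDate"]
DF2["SignUpDate"] = DF2["CustomerID"].map(signup_by_customer)

print(DF2)
```

With Option 2, any CustomerID in DF2 that has no match in DF1 ends up as NaN in SignUpDate, the same as with a left merge.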

Upvotes: 1
