Reputation: 1
I have two datasets. I would like to merge using the index.
The 1st data set:
index A B C
01/01/2010 15 20 30
15/01/2010 12 15 25
17/02/2010 14 13 35
19/02/2010 11 10 22
The 2nt data set:
index year month price
0 2010 january 70
1 2010 february 80
I want them to be joined like:
index A B C price
01/01/2010 15 20 30 70
15/01/2010 12 15 25 70
17/02/2010 14 13 35 80
19/02/2010 11 10 22 80
The problem is how to use two columns (year
and month
of the 2nd dataset) to create a temporary datetime index
.
Upvotes: 0
Views: 70
Reputation: 59
here is the stupid answer! I'm sure you can do smarter than that :) But that works, considering your tables are a list of dictionaries (you can easily convert your SQL tables in this format). I'm aware this is not a clean solution, but you asked for an easy one, probably that's the easiest to understand :)
months = {'january': "01",
'february': "02",
'march': "03",
'april':"04",
'may': "05",
'june': "06",
'july': "07",
'august': "08",
'september': "09",
'october': "10",
'november': "11",
'december': "12"}
table1 = [{'index': '01/01/2010', 'A': 15, 'B': 20, 'C': 30},
{'index': '15/01/2010', 'A': 12, 'B': 15, 'C': 25},
{'index': '17/02/2010', 'A': 14, 'B': 13, 'C': 35},
{'index': '19/02/2010', 'A': 11, 'B': 10, 'C': 22}]
table2 = [{'index': 0, 'year': 2010, 'month': 'january', 'price':70},
{'index': 1, 'year': 2010, 'month': 'february', 'price':80}]
def joiner(table1, table2):
for row in table2:
row['tempDate'] = "{0}/{1}".format(months[row['month']], str(row['year']))
for row in table1:
row['tempDate'] = row['index'][3:]
table3 = []
for row1 in table1:
row3 = row1.copy()
for row2 in table2:
if row2['tempDate'] == row1['tempDate']:
row3['price'] = row2['price']
break
table3.append(row3)
return(table3)
table3 = joiner(table1, table2)
print(table3)
Upvotes: 0
Reputation: 5785
Try this, by extracting .month_name() and year(.dt.year
) from df1
and merge it with df2
>>> df1
index A B C
0 01/01/2010 15 20 30
1 15/01/2010 12 15 25
2 17/02/2010 14 13 35
3 19/02/2010 11 10 22
>>> df2
index year month price
0 0 2010 january 70
1 1 2010 february 80
# merging df1 and df2 by month and year.
>>> df1.merge(df2,
left_on = [pd.to_datetime(df1['index']).dt.year,
pd.to_datetime(df1['index']).dt.month_name().str.lower()],
right_on = ['year', 'month'])
Output:
index_x A B C index_y year month price
0 01/01/2010 15 20 30 0 2010 january 70
1 15/01/2010 12 15 25 0 2010 january 70
2 17/02/2010 14 13 35 1 2010 february 80
3 19/02/2010 11 10 22 1 2010 february 80
Upvotes: 2