Mustapha hakkou
Mustapha hakkou

Reputation: 1

How Do I Merge These Two Datasets?

I have two datasets. I would like to merge using the index.

The 1st data set:

index       A   B   C
01/01/2010  15  20  30
15/01/2010  12  15  25
17/02/2010  14  13  35
19/02/2010  11  10  22

The 2nt data set:

index  year  month     price
0      2010  january   70
1      2010  february  80

I want them to be joined like:

index       A   B   C   price
01/01/2010  15  20  30  70
15/01/2010  12  15  25  70
17/02/2010  14  13  35  80
19/02/2010  11  10  22  80

The problem is how to use two columns (year and month of the 2nd dataset) to create a temporary datetime index.

Upvotes: 0

Views: 70

Answers (2)

Matthieu Gloriès
Matthieu Gloriès

Reputation: 59

here is the stupid answer! I'm sure you can do smarter than that :) But that works, considering your tables are a list of dictionaries (you can easily convert your SQL tables in this format). I'm aware this is not a clean solution, but you asked for an easy one, probably that's the easiest to understand :)

months = {'january': "01",
          'february': "02",
          'march': "03",
          'april':"04",
          'may': "05",
          'june': "06",
          'july': "07",
          'august': "08",
          'september': "09",
          'october': "10",
          'november': "11",
          'december': "12"}

table1 = [{'index': '01/01/2010', 'A': 15, 'B': 20, 'C': 30},
          {'index': '15/01/2010', 'A': 12, 'B': 15, 'C': 25},
          {'index': '17/02/2010', 'A': 14, 'B': 13, 'C': 35},
          {'index': '19/02/2010', 'A': 11, 'B': 10, 'C': 22}]

table2 = [{'index': 0, 'year': 2010, 'month': 'january', 'price':70},
          {'index': 1, 'year': 2010, 'month': 'february', 'price':80}]

def joiner(table1, table2):
    for row in table2:
        row['tempDate'] = "{0}/{1}".format(months[row['month']], str(row['year']))
    for row in table1:
        row['tempDate'] = row['index'][3:]

    table3 = []
    for row1 in table1:
        row3 = row1.copy()
        for row2 in table2:
            if row2['tempDate'] == row1['tempDate']:
                row3['price'] = row2['price']
                break
        table3.append(row3)
    return(table3)

table3 = joiner(table1, table2)
print(table3)

Upvotes: 0

shaik moeed
shaik moeed

Reputation: 5785

Try this, by extracting .month_name() and year(.dt.year) from df1 and merge it with df2

>>> df1      
        index   A   B   C
0  01/01/2010  15  20  30
1  15/01/2010  12  15  25
2  17/02/2010  14  13  35
3  19/02/2010  11  10  22
>>> df2              
   index  year     month  price
0      0  2010   january     70
1      1  2010  february     80

# merging df1 and df2 by month and year.
>>> df1.merge(df2,
              left_on = [pd.to_datetime(df1['index']).dt.year,
                 pd.to_datetime(df1['index']).dt.month_name().str.lower()],
              right_on = ['year', 'month'])

Output:

      index_x   A   B   C  index_y  year     month  price
0  01/01/2010  15  20  30        0  2010   january     70
1  15/01/2010  12  15  25        0  2010   january     70
2  17/02/2010  14  13  35        1  2010  february     80
3  19/02/2010  11  10  22        1  2010  february     80

Upvotes: 2

Related Questions