Ana
Ana

Reputation: 1586

how to combine two dataframes without a common key in Python

I have two dataframes A and B:

A= Date 2016-01 2016-02 2016-03 2016-04 2016-05

B= Titles Manager Direcotr Associates

I'd like to have a dataframe of size 20x2 as follows:

AB= Date Title 2016-01 Manager 2016-02 Manager 2016-03 Manager ... 2016-04 Associates 2016-05 Associates

This is what I tried:

columns = ['Date','Title']
b = pd.DataFrame(columns=columns)

for i in A.Date:
    for j in B.Title:
        b.Date = i
        b.Title = j

But this does not work. Any hints would help a lot.

Upvotes: 0

Views: 7080

Answers (4)

Yao
Yao

Reputation: 1

A['dummy'] = 1 
B['dummy'] = 1 
AB = A.merge(B, on='dummy')
AB = AB.drop('dummy', axis=1)

Upvotes: 0

Shambhavi Chati
Shambhavi Chati

Reputation: 31

Try this:

    import pandas as pd  
    
    df1 = pd.DataFrame({'Dates': ['Date0', 'Date1', 'Date2', 'Date3']})

    df2 = pd.DataFrame({'Titles': ['Manager', 'Director', 'Associates']})

    result = pd.concat([df1, df2],axis = 1, join = 'outer', ignore_index=False, sort=False)

Upvotes: 3

Victor Chubukov
Victor Chubukov

Reputation: 1375

A slightly simpler version of Kiran's answer:

from itertools import product
import pandas as pd

df1 = pd.DataFrame({'Dates': ['Date0', 'Date1', 'Date2', 'Date3']})

df2 = pd.DataFrame({'Titles': ['Manager', 'Director', 'Associates']})

We just want a 1-d list of the values to pass to product whereas .values will get us a 2-d array

prod = product(df1['Dates'], df2['Titles'])

And we can construct the DataFrame directly from this

pd.DataFrame(list(prod),columns=['Date','Title'])

It is generally preferred (and much faster) to build the entire DataFrame from a list as opposed to row-by-row.

Also, there's of course no reason A and B need to be DataFrames. If they're lists or Series objects, then the code will be simpler yet.

Upvotes: 1

user378704
user378704

Reputation:

I tried pandas for the first time but maybe this is the answer you are looking for:

from itertools import product
import pandas as pd

df1 = pd.DataFrame({'Dates': ['Date0', 'Date1', 'Date2', 'Date3']},
                   index=[0, 1, 2, 3])

df2 = pd.DataFrame({'Titles': ['Manager', 'Director', 'Associates']},
                   index=[0, 1, 2])

columns = columns = ['Date','Title']
result = pd.DataFrame(columns=columns)

prod = product(df1.values, df2.values)
for index, i in enumerate(prod):
    result.loc[index] = [i[0][0], i[1][0]]

Which gave me the output

    Date    Title
0   Date0   Manager
1   Date0   Director
2   Date0   Associates
3   Date1   Manager
4   Date1   Director
5   Date1   Associates
6   Date2   Manager
7   Date2   Director
8   Date2   Associates
9   Date3   Manager
10  Date3   Director
11  Date3   Associates

Upvotes: 1

Related Questions