Reputation: 1586
I have two dataframes A
and B
:
A=
Date
2016-01
2016-02
2016-03
2016-04
2016-05
B=
Titles
Manager
Direcotr
Associates
I'd like to have a dataframe of size 20x2 as follows:
AB=
Date Title
2016-01 Manager
2016-02 Manager
2016-03 Manager
...
2016-04 Associates
2016-05 Associates
This is what I tried:
columns = ['Date','Title']
b = pd.DataFrame(columns=columns)
for i in A.Date:
for j in B.Title:
b.Date = i
b.Title = j
But this does not work. Any hints would help a lot.
Upvotes: 0
Views: 7080
Reputation: 1
A['dummy'] = 1
B['dummy'] = 1
AB = A.merge(B, on='dummy')
AB = AB.drop('dummy', axis=1)
Upvotes: 0
Reputation: 31
Try this:
import pandas as pd
df1 = pd.DataFrame({'Dates': ['Date0', 'Date1', 'Date2', 'Date3']})
df2 = pd.DataFrame({'Titles': ['Manager', 'Director', 'Associates']})
result = pd.concat([df1, df2],axis = 1, join = 'outer', ignore_index=False, sort=False)
Upvotes: 3
Reputation: 1375
A slightly simpler version of Kiran's answer:
from itertools import product
import pandas as pd
df1 = pd.DataFrame({'Dates': ['Date0', 'Date1', 'Date2', 'Date3']})
df2 = pd.DataFrame({'Titles': ['Manager', 'Director', 'Associates']})
We just want a 1-d list of the values to pass to product
whereas .values
will get us a 2-d array
prod = product(df1['Dates'], df2['Titles'])
And we can construct the DataFrame
directly from this
pd.DataFrame(list(prod),columns=['Date','Title'])
It is generally preferred (and much faster) to build the entire DataFrame
from a list as opposed to row-by-row.
Also, there's of course no reason A
and B
need to be DataFrames. If they're lists or Series
objects, then the code will be simpler yet.
Upvotes: 1
Reputation:
I tried pandas for the first time but maybe this is the answer you are looking for:
from itertools import product
import pandas as pd
df1 = pd.DataFrame({'Dates': ['Date0', 'Date1', 'Date2', 'Date3']},
index=[0, 1, 2, 3])
df2 = pd.DataFrame({'Titles': ['Manager', 'Director', 'Associates']},
index=[0, 1, 2])
columns = columns = ['Date','Title']
result = pd.DataFrame(columns=columns)
prod = product(df1.values, df2.values)
for index, i in enumerate(prod):
result.loc[index] = [i[0][0], i[1][0]]
Which gave me the output
Date Title
0 Date0 Manager
1 Date0 Director
2 Date0 Associates
3 Date1 Manager
4 Date1 Director
5 Date1 Associates
6 Date2 Manager
7 Date2 Director
8 Date2 Associates
9 Date3 Manager
10 Date3 Director
11 Date3 Associates
Upvotes: 1