Reputation: 2580
Say we have dataframes set as follows:
df1 = pd.DataFrame(np.random.randint(0, 2, (10, 2)), columns=['Cow', 'Sheep'])
df2 = pd.DataFrame(np.random.randint(0, 2, (10, 5)), columns=['Hungry', 'Scared', 'Happy', 'Bored', 'Sad'])
df3 = pd.DataFrame(np.random.randint(0, 2, (10, 2)), columns=['Davids', 'Michaels'])
df1.index.name = df2.index.name = df3.index.name = 'id'
combos_to_test = pd.DataFrame([('Davids', 'Cow', 'Hungry'),
('Michaels', 'Cow', 'Hungry'),
('Davids', 'Cow', 'Scared'),
('Michaels', 'Cow', 'Scared'),
('Michaels', 'Sheep', 'Scared'),
('Davids', 'Sheep', 'Happy'),
('Michaels', 'Sheep', 'Happy'),])
example :
DF1: DF2: DF3:
id Cow Sheep id Hungry Scared Happy Bored Sad id Davids Michaels
0 0 1 0 0 1 1 0 1 0 1 0
1 0 0 1 1 0 0 1 1 1 0 1
2 0 0 2 1 0 0 1 1 2 0 0
3 1 0 3 0 0 1 0 1 3 0 1
4 1 0 4 0 0 1 1 0 4 0 1
5 1 1 5 0 0 1 1 0 5 1 0
6 1 1 6 1 0 1 1 0 6 1 0
7 1 0 7 1 1 1 1 0 7 1 1
8 1 1 8 1 1 1 1 0 8 1 0
9 1 0 9 0 1 1 0 0 9 1 0
And I need a 4th dataframe, which finds (for each combination), when each combos_to_test
is a column.
The way I plan on doing this is to change the columns into:
df1.columns = Cow, Cow, Cow, Cow, Sheep, Sheep, Sheep
df2.columns = Hungry, Hungry, Scared, Scared, Happy, Happy
df3.columns = David, Michael, David, Michael, Michael, David, Michael
Then renaming all the cols to col1, col2, col3, ..., col8
and then multiplying each dataframe by eachother (which will vectorise it - but take large amounts of memory).
My dataset is obviously much bigger, and will be using numpy/pandas.
The output df should look like:
('Davids', 'Cow', 'Hungry') | ('Michaels', 'Cow', 'Hungry') | ('Davids', 'Cow', 'Scared') | ('Michaels', 'Cow', 'Scared') | ...
1) 0 1 0 0
2) 0 0 0 0
3) 0 1 0 0
4) 0 0 1 0
5) 0 0 0 0
6) 0 0 0 0
7) 0 0 0 0
8) 0 0 1 1
9) 1 0 0 0
10) 1 0 0 0
Upvotes: 1
Views: 75
Reputation: 294508
I can do this with pd.concat
df = pd.concat([df1, df2, df3], axis=1)
pd.concat({
ctt: df.reindex(columns=ctt).prod(1)
for ctt in map(tuple, combos_to_test.values)
}, axis=1)
Davids Michaels
Cow Sheep Cow Sheep
Hungry Scared Happy Hungry Scared Happy Scared
id
0 0 0 0 0 0 0 0
1 1 1 0 1 1 0 1
2 0 0 0 0 0 0 0
3 0 0 0 0 0 1 0
4 1 1 0 0 0 0 0
5 0 0 0 1 1 1 1
6 0 0 0 0 0 0 0
7 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0
9 0 0 0 0 0 0 0
Upvotes: 3
Reputation: 49
The easiest way to copy a column is to use just:
df1['Cow_copy'] = df1['Cow']
If you want to copy many columns, you could make a List of columns and loop through it and using the code above for each one.
Upvotes: 0