Reputation: 1307
I have a data frame with a numeric column, plus a list of tuples and a list of strings. Each value in the numeric column is an index into the list of tuples, and the tuple at that index holds the values that should be added to that row. The list of strings holds the names of the columns to be added.
Example:
import pandas as pd
df = pd.DataFrame({'number':[0,0,1,1,2,2,3,3]})
# a list of keys and a list of tuples
keys = ['foo','bar']
combinations = [('99%',0.9),('99%',0.8),('1%',0.9),('1%',0.8)]
Expected output:
   number  foo  bar
0       0  99%  0.9
1       0  99%  0.9
2       1  99%  0.8
3       1  99%  0.8
4       2   1%  0.9
5       2   1%  0.9
6       3   1%  0.8
7       3   1%  0.8
Upvotes: 2
Views: 92
Reputation: 1811
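When df has exactly one row per tuple (for example number = [0, 1, 2, 3]), you can build a second frame from the tuples and concatenate it column-wise: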
df2 = pd.DataFrame(combinations, columns = keys)
pd.concat([df, df2], axis=1)
which returns
   number  foo  bar
0       0  99%  0.9
1       1  99%  0.8
2       2   1%  0.9
3       3   1%  0.8
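Based on your new requirements, where numbers repeat, you can merge on the number instead, so that each row picks up the tuple at the position given by its number: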
df.set_index('number', inplace=True)
df = df.merge(df2, left_index = True, right_index=True)
df = df.reset_index().rename(columns={'index':'number'})
This also works when the numbers are repeated a different number of times, e.g.
df = pd.DataFrame({'number':[0,0,1,1,1,2,2,3,3,3]})
returns
   number  foo  bar
0       0  99%  0.9
1       0  99%  0.9
2       1  99%  0.8
3       1  99%  0.8
4       1  99%  0.8
5       2   1%  0.9
6       2   1%  0.9
7       3   1%  0.8
8       3   1%  0.8
9       3   1%  0.8
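A self-contained sketch of the same idea, for anyone who wants to run it end to end. It is a variant rather than the exact code above: it assumes `left_on='number'` combined with `right_index=True` is acceptable in place of moving `number` into the index first, and the name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 2, 2, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

# lookup table: row i holds the values of combinations[i]
df2 = pd.DataFrame(combinations, columns=keys)

# match each value of 'number' against the positional index of df2;
# how='left' keeps every row of df, in its original order
result = df.merge(df2, left_on='number', right_index=True, how='left')
print(result)
```

Because the lookup happens per row, the number of times each value repeats in `number` does not matter.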
Upvotes: 2
Reputation: 1307
I found one solution using:
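pieces = []
for model_number, df_subset in df.groupby('number'):
    df_subset = df_subset.copy()  # work on a copy, not on a slice of df
    for key_idx, key in enumerate(keys):
        df_subset[key] = combinations[model_number][key_idx]
    pieces.append(df_subset)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
df_new = pd.concat(pieces)
But this seems pretty 'dirty' to me. Is there a better, more efficient solution?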
Upvotes: 1
Reputation: 18377
You can use a list comprehension inside a for loop. I think it's a pretty fast and straightforward approach, as long as df has exactly one row per tuple in combinations:
for i in range(len(keys)):
df[keys[i]] = [x[i] for x in combinations]
Output:
   number  foo  bar
0       0  99%  0.9
1       1  99%  0.8
2       2   1%  0.9
3       3   1%  0.8
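If the numbers repeat, as in the question's example, the list built from `combinations` no longer has the same length as the frame. A possible adaptation, offered as a sketch rather than as part of the answer above, is to index `combinations` by each row's `number`. It assumes every value in `number` is a valid position in `combinations`.

```python
import pandas as pd

df = pd.DataFrame({'number': [0, 0, 1, 1, 2, 2, 3, 3]})
keys = ['foo', 'bar']
combinations = [('99%', 0.9), ('99%', 0.8), ('1%', 0.9), ('1%', 0.8)]

for i, key in enumerate(keys):
    # for every row, take element i of the tuple selected by that row's number
    df[key] = [combinations[n][i] for n in df['number']]
print(df)
```

This keeps the loop-plus-comprehension shape but produces one value per row of `df` instead of one per tuple.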
Upvotes: 1