Reputation: 167
I have a list abc of size x and a DataFrame with 2x rows. I want to assign the values from abc to a new column in the DataFrame.
When the df has the same number of rows as the list, or fewer, I just write:
df['NewCol'] = abc[:df.shape[0]]
When the df has more rows than the list (in this case twice as many), I use a for loop like the one below:
for i, rowData in df.iterrows():
    i = i - 1
    j = i // 2
    df['NewCol'].iloc[i] = abc[j]
Here the size of df is exactly twice the size of the list. I will always have the case where the size of df is either twice or thrice the size of the list, so that one list entry is matched to two or three consecutive rows.
Is there any faster way to achieve this?
Upvotes: 1
Views: 711
Reputation: 2436
You could use numpy.repeat to repeat each element of your list, since you are sure the ratio of the two lengths will always be an integer.
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':np.arange(6)})
abc = [4, 5, 6]
df['NewCol'] = np.repeat(abc, len(df) // len(abc))  # integer division: repeats must be an int
df
   a  NewCol
0  0       4
1  1       4
2  2       5
3  3       5
4  4       6
5  5       6
If you prefer to have the list repeated as a whole, you can use np.tile:
df['NewCol2'] = np.tile(abc, len(df) // len(abc))  # integer division: reps must be an int
df
   a  NewCol  NewCol2
0  0       4        4
1  1       4        5
2  2       5        6
3  3       5        4
4  4       6        5
5  5       6        6
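Both calls above rely on len(df) being an exact multiple of len(abc); if it is not, the assignment fails with a length mismatch. A minimal sketch of a guard that makes this assumption explicit is shown below. The helper name repeat_to_length is hypothetical (it is not part of NumPy or pandas) and is only meant as an illustration:

```python
import numpy as np
import pandas as pd


def repeat_to_length(values, target_len):
    """Repeat each element of `values` so the result has `target_len` items.

    Hypothetical helper for illustration only. It raises ValueError when
    `target_len` is not an exact multiple of len(values).
    """
    if len(values) == 0:
        raise ValueError("values must not be empty")
    factor, remainder = divmod(target_len, len(values))
    if remainder != 0:
        raise ValueError(
            f"target length {target_len} is not a multiple of {len(values)}"
        )
    # np.repeat repeats each element `factor` times, keeping the order.
    return np.repeat(values, factor)


df = pd.DataFrame({'a': np.arange(6)})
abc = [4, 5, 6]
df['NewCol'] = repeat_to_length(abc, len(df))
```

The same check could wrap np.tile if the whole list should be cycled instead.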
Upvotes: 0
Reputation: 109626
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(4, 3), columns=list('ABC'))
abc = ['a', 'b']
From the question: "I will always have the case where the size of df is either twice/thrice the list."
multiplier = len(df) // len(abc)  # Should be 2 or 3 per the condition above.
df = df.assign(NewCol=[val for val in abc for _ in range(multiplier)])
>>> df
          A         B         C NewCol
0 -0.262760  1.898977  2.265480      a
1  0.552906  2.144316 -0.942272      a
2 -1.429635 -0.060660  0.756665      b
3 -0.658036 -1.056586  1.458374      b
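For comparison, the row-to-list mapping from the question's loop (list position = row position divided by the multiplier) can also be written as a single vectorized lookup. This is only a sketch under the same assumption that len(df) is an exact multiple of len(abc); the sample column values and the name positions are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative data: 4 rows, list of 2 entries.
df = pd.DataFrame({'A': [10, 20, 30, 40]})
abc = ['a', 'b']

multiplier = len(df) // len(abc)  # 2 here; assumed to divide evenly

# Row position i maps to list position i // multiplier.
positions = np.arange(len(df)) // multiplier

# Fancy indexing picks abc[positions[i]] for every row at once.
# The assignment is positional, so the df's index labels do not matter.
df['NewCol'] = np.asarray(abc)[positions]
```

Because no Python-level loop runs over the rows, this avoids the per-row cost of iterrows.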
Upvotes: 1