Reputation: 169
I have the following input table (y):
parameter1 | parameter2 |
---|---|
1 | 12 |
2 | 23 |
3 | 66 |
4 | 98 |
5 | 90 |
6 | 14 |
7 | 7 |
8 | 56 |
9 | 1 |
I would like to randomly assign the values A1 to A9 to the rows, one value per row, in a new column. The output table should look like the following:
parameter1 | parameter2 | parameter3 |
---|---|---|
1 | 12 | A5 |
2 | 23 | A2 |
3 | 66 | A4 |
4 | 98 | A8 |
5 | 90 | A3 |
6 | 14 | A7 |
7 | 7 | A1 |
8 | 56 | A9 |
9 | 1 | A6 |
n = 9
TGn = round(len(y)/n)
idx = set(y.index // TGn)
y = y.apply(lambda x: x.sample(frac=1,random_state=1234)).reset_index(drop=True)
treatment_groups = [f"A{i}" for i in range(1, n+1)]
y['groupAfterRandomization'] = (y.index // TGn).map(dict(zip(idx, treatment_groups)))
The value in the first row is not filled in; it shows up as NaN. How can I fix this?
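One possible cause, sketched under an assumption the question does not confirm: if the index of `y` starts at 1, the keys in `idx` are built from the old index (1 to 9), while the lookup runs on the index after `reset_index` (0 to 8), so key 0 has no match. The sketch below uses `DataFrame.sample` for the shuffle as a simplification of the `apply` call above; the variable names `shuffled`, `broken`, `keys` and `fixed` are illustrative only.

```python
import pandas as pd

# Assumption: the original index starts at 1 (the question does not show it).
y = pd.DataFrame(
    {
        "parameter1": range(1, 10),
        "parameter2": [12, 23, 66, 98, 90, 14, 7, 56, 1],
    },
    index=range(1, 10),
)

n = 9
TGn = round(len(y) / n)        # rows per group, 1 here
idx = set(y.index // TGn)      # keys 1..9, taken from the OLD index

# Shuffle the rows, then renumber them 0..8.
shuffled = y.sample(frac=1, random_state=1234).reset_index(drop=True)
treatment_groups = [f"A{i}" for i in range(1, n + 1)]

# The new index contains 0, which is not a key in the mapping, so it maps to NaN.
broken = (shuffled.index // TGn).map(dict(zip(idx, treatment_groups)))

# Building the keys from the index after reset_index keeps both in step.
keys = sorted(set(shuffled.index // TGn))
fixed = (shuffled.index // TGn).map(dict(zip(keys, treatment_groups)))
```

If the index of `y` already starts at 0, this mismatch does not occur and the NaN has a different cause.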
Upvotes: 1
Views: 64
Reputation: 71689
Series.sample
We can use `sample` with `frac=1` to shuffle the values from the column `parameter1`, then use `radd` to prepend the prefix `A` to the sampled values:
df['parameter3'] = df['parameter1'].sample(frac=1).astype(str).radd('A').values
parameter1 parameter2 parameter3
0 1 12 A2
1 2 23 A8
2 3 66 A1
3 4 98 A4
4 5 90 A9
5 6 14 A3
6 7 7 A6
7 8 56 A7
8 9 1 A5
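For reference, a self-contained sketch of the same approach, rebuilding the input table from the question. The `random_state=0` seed is an arbitrary addition for reproducibility and is not part of the one-liner above; `shuffled` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "parameter1": range(1, 10),
        "parameter2": [12, 23, 66, 98, 90, 14, 7, 56, 1],
    }
)

# frac=1 returns every value exactly once, in random order.
# random_state=0 is an arbitrary seed, added only so reruns give the same result.
shuffled = df["parameter1"].sample(frac=1, random_state=0)

# .to_numpy() drops the shuffled index so the assignment is positional.
# Assigning the Series itself would realign on the index and undo the shuffle.
df["parameter3"] = ("A" + shuffled.astype(str)).to_numpy()

print(df)
```

This relies on `parameter1` already holding the numbers 1 to 9. If it held other values, the labels themselves would need to be shuffled instead.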
Upvotes: 1