Keval Shah

Reputation: 25

Assign 1 value to random sample of group where the sample size is equal to the value of another column

I want to fill a new IsShade column (the output) with 0s and 1s so that, within each group, the value 1 is assigned at random to exactly as many rows as the Shading column says (for example 2, 5, or 3 rows), out of the group's total number of rows given by the Total column (for example 6, 8, or 5 rows).

The full dataset has about 1 million rows; a sample of the input is shown below.

Input:

In[1]: 
    Sr  Series  Parallel  Shading  Total  Cell
0    0       3         2        2      6     1
1    1       3         2        2      6     2
2    2       3         2        2      6     3
3    3       3         2        2      6     4
4    4       3         2        2      6     5
5    5       3         2        2      6     6
6    6       4         2        5      8     1
7    7       4         2        5      8     2
8    8       4         2        5      8     3
9    9       4         2        5      8     4
10  10       4         2        5      8     5
11  11       4         2        5      8     6
12  12       4         2        5      8     7
13  13       4         2        5      8     8
14  14       5         1        3      5     1
15  15       5         1        3      5     2
16  16       5         1        3      5     3
17  17       5         1        3      5     4
18  18       5         1        3      5     5

Any guidance on how to achieve this, or Python code that does it, would be helpful. Thank you, I appreciate it.

Example Expected Output:

Out[1]: 
    Sr  Series  Parallel  Shading  Total  Cell  IsShade
0    0       3         2        2      6     1        0
1    1       3         2        2      6     2        0
2    2       3         2        2      6     3        1
3    3       3         2        2      6     4        0
4    4       3         2        2      6     5        0
5    5       3         2        2      6     6        1
6    6       4         2        5      8     1        1
7    7       4         2        5      8     2        0
8    8       4         2        5      8     3        1
9    9       4         2        5      8     4        1
10  10       4         2        5      8     5        0
11  11       4         2        5      8     6        0
12  12       4         2        5      8     7        1
13  13       4         2        5      8     8        1
14  14       5         1        3      5     1        0
15  15       5         1        3      5     2        1
16  16       5         1        3      5     3        0
17  17       5         1        3      5     4        1
18  18       5         1        3      5     5        1

Upvotes: 1

Views: 106

Answers (1)

David Erickson

Reputation: 16683

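You can group the rows with .groupby and, within each group, use .sample to randomly select as many rows as the integer in the Shading column. Comparing the sampled values with 0 gives True for the selected rows. Once the result is assigned back to the frame, the rows that were not selected are NaN, so I fill them with False and convert to an integer with .astype(int) (True becomes 1 and False becomes 0):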

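# s labels each consecutive run of equal Series values, giving a unique group identifier
s = df['Series'].ne(df['Series'].shift()).cumsum()
# Within each group, sample as many rows as that group's Shading value; sampled rows become True
df['IsShade'] = (df.groupby(s, group_keys=False)
                   .apply(lambda x: x['Shading'].sample(x['Shading'].iloc[0])) > 0)
# Rows that were not sampled are NaN after index alignment; mark them as 0
df['IsShade'] = df['IsShade'].fillna(False).astype(int)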
df
Out[1]: 
    Sr  Series  Parallel  Shading  Total  Cell  IsShade
0    0       3         2        2      6     1        0
1    1       3         2        2      6     2        0
2    2       3         2        2      6     3        0
3    3       3         2        2      6     4        0
4    4       3         2        2      6     5        1
5    5       3         2        2      6     6        1
6    6       4         2        5      8     1        1
7    7       4         2        5      8     2        1
8    8       4         2        5      8     3        0
9    9       4         2        5      8     4        0
10  10       4         2        5      8     5        1
11  11       4         2        5      8     6        1
12  12       4         2        5      8     7        1
13  13       4         2        5      8     8        0
14  14       5         1        3      5     1        1
15  15       5         1        3      5     2        0
16  16       5         1        3      5     3        0
17  17       5         1        3      5     4        1
18  18       5         1        3      5     5        1
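
Since the question mentions about 1 million rows, a per-group .apply may be slow when there are many groups. Below is a minimal sketch of a vectorized alternative, not part of the code above: draw one random number per row, rank those numbers within each group, and mark a row as shaded when its rank is at most the group's Shading value. The function name assign_shade and its seed parameter are made up for illustration, and the sketch assumes Shading is constant within a group and never larger than the group's row count.

```python
import numpy as np
import pandas as pd


def assign_shade(df, seed=None):
    """Return a 0/1 Series with exactly `Shading` ones per group (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Same grouping idea as above: label each consecutive run of equal Series values.
    group_id = df['Series'].ne(df['Series'].shift()).cumsum()
    # One uniform random draw per row, aligned to the frame's index.
    draws = pd.Series(rng.random(len(df)), index=df.index)
    # Rank the draws within each group: 1..n in random order, no ties.
    rank = draws.groupby(group_id).rank(method='first')
    # The rows holding the `Shading` smallest ranks in each group get a 1.
    return (rank <= df['Shading']).astype(int)


# Small frame mirroring the group sizes and Shading values from the question.
df = pd.DataFrame({
    'Series': [3] * 6 + [4] * 8 + [5] * 5,
    'Shading': [2] * 6 + [5] * 8 + [3] * 5,
})
df['IsShade'] = assign_shade(df, seed=0)
print(df.groupby('Series')['IsShade'].sum())
```

Because the ranks within a group are a random permutation of 1..n, every subset of the required size is equally likely, which is the same distribution as sampling without replacement. Passing a fixed seed makes the assignment reproducible.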

Upvotes: 2
