Reputation: 427
I would like to add a column of random fractions that sum to 1.
Current table:
Col1 Value
A 100
B 100
Wanted example 1: with two rows, the random values can be any numbers, but they must total 1.
Col1  Value  Random  New Value
A     100    0.7     70
B     100    0.3     30
Total        1
Wanted example 2: the same with three rows.
Col1  Value  Random  New Value
A     100    0.2     20
B     100    0.1     10
C     100    0.7     70
Total        1
Upvotes: 3
Views: 1155
Reputation: 17882
You can use the NumPy function np.random.randint:
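import numpy as np
import pandas as pd

df = pd.DataFrame({'Value': [100, 100, 100]})
# Draw integers from 1 to 9; starting at 1 avoids an all-zero draw (division by zero)
nums = np.random.randint(1, 10, size=len(df))
df['Random'] = nums / nums.sum()  # normalize so the column sums to 1
df['New'] = df['Value'] * df['Random']
df.loc['Sum', :] = df.sum()  # append a totals row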
Output:
     Value  Random     New
0    100.0  0.1250   12.50
1    100.0  0.3125   31.25
2    100.0  0.5625   56.25
Sum  300.0  1.0000  100.00
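The same normalize-by-the-sum idea also works with random floats instead of integers. Below is a minimal sketch, assuming NumPy's newer `default_rng` generator is available; the helper name `add_random_shares` and its `seed` parameter are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd


def add_random_shares(df, value_col="Value", seed=None):
    """Return a copy of df with a 'Random' column summing to 1 and a scaled 'New' column."""
    rng = np.random.default_rng(seed)
    # rng.random() is in [0, 1), so 1 - rng.random() is in (0, 1]: every draw is positive
    raw = 1.0 - rng.random(len(df))
    out = df.copy()
    out["Random"] = raw / raw.sum()  # normalize so the shares sum to 1
    out["New"] = out[value_col] * out["Random"]
    return out


df = pd.DataFrame({"Col1": list("ABC"), "Value": [100, 100, 100]})
result = add_random_shares(df, seed=0)
print(result)
print(result["Random"].sum())  # 1, up to floating-point error
```

Because the shares are floats, the total is 1 only up to floating-point error, and the shares are not rounded to "nice" values.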
Upvotes: 2
Reputation: 15872
You can use np.random.dirichlet and np.around:
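import pandas as pd
import numpy as np

df = pd.DataFrame({'Col1': list("ABC"), 'Value': [100] * 3})
# One Dirichlet draw gives positive weights that sum to 1.
# Note: rounding to 1 decimal can move the total away from 1 (e.g. 0.3 + 0.3 + 0.3).
weights = np.random.dirichlet(np.ones(len(df)))
df['random'] = np.around(weights, decimals=1)
# round() before astype(int) avoids truncating values such as 29.999999
df['New value'] = (df['Value'] * df['random']).round().astype(int)
print(df)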
Output:
  Col1  Value  random  New value
0    A    100     0.4         40
1    B    100     0.3         30
2    C    100     0.3         30
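Rounding Dirichlet weights independently does not guarantee that the rounded values still total 1. One possible alternative, sketched here as an assumption rather than as part of the answer above, is to hand out a fixed number of tenths with a multinomial draw, so the integer counts always sum to that number. The `steps` variable is a hypothetical name for the granularity.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # seed fixed only for reproducibility

df = pd.DataFrame({"Col1": list("ABC"), "Value": [100] * 3})

steps = 10  # hypothetical granularity: shares are multiples of 1/steps
# Distribute `steps` units across the rows; the integer counts always sum to `steps`
counts = rng.multinomial(steps, np.full(len(df), 1.0 / len(df)))

df["random"] = counts / steps
# Integer arithmetic keeps "New value" exact when Value * count is divisible by steps
df["New value"] = df["Value"] * counts // steps
print(df)
print(counts.sum())  # always equals steps
```

A row can receive a share of 0 with this approach, so it only fits if zero shares are acceptable.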
Upvotes: 2