Reputation: 405
I want to create a DataFrame using pandas where one column is 'EmployeeID' and the second is 'skill', the skill level the employee has, ranging from 1 to 5. The 'EmployeeID' column should have unique values, whereas the 'skill' column can have repeated values. I tried to generate the 'EmployeeID' column using the code below:
df = pd.DataFrame({'EmployeeID':[random.sample(range(123456,135000),100)]})
but the result is not what I expected. It generated all the numbers and put them in one row.
Upvotes: 0
Views: 933
Reputation: 863401
Use numpy.random.randint for the IDs and numpy.tile if you need to repeat the range 1-5:
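import numpy as np
import pandas as pd
# note: randint samples with replacement, so IDs may occasionally repeat
df = pd.DataFrame({'EmployeeID': np.random.randint(123456, 135000, 100),
                   'skill': np.tile(np.arange(1, 6), 20)})
print(df.head(10))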
   EmployeeID  skill
0      129323      1
1      126570      2
2      124034      3
3      129659      4
4      125654      5
5      127093      1
6      123780      2
7      125665      3
8      124063      4
9      125061      5
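If the IDs must be strictly unique, as the question asks, one option is to keep random.sample from the original attempt and drop the outer square brackets. This is only a sketch, assuming the same ID range and 100 rows as above; the name employee_ids is introduced here for illustration. Wrapping the sample in another list is what made pandas treat all 100 numbers as a single cell.

```python
import random

import numpy as np
import pandas as pd

# random.sample draws without replacement, so the 100 IDs are distinct.
employee_ids = random.sample(range(123456, 135000), 100)

# Pass the list itself (not wrapped in another list) so each ID gets its own row.
df = pd.DataFrame({'EmployeeID': employee_ids,
                   'skill': np.tile(np.arange(1, 6), 20)})
print(df.head(10))
```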
If you instead need random values in the range 1-5 for the skill column, use randint for both columns:
df = pd.DataFrame({'EmployeeID': np.random.randint(123456, 135000, 100),
                   'skill': np.random.randint(1, 6, 100)})
print(df.head(10))
   EmployeeID  skill
0      131496      2
1      133133      4
2      130999      2
3      127685      5
4      129008      1
5      124238      3
6      124147      3
7      123592      3
8      133859      1
9      126097      3
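A NumPy-only variant that also guarantees unique IDs might look like the following. It is a sketch rather than the only way to do it: it assumes NumPy 1.17 or later for default_rng, and the names rng and n are illustrative. Sampling with replace=False rules out duplicate IDs, while the skill column is still free to repeat.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng()  # pass a seed here for reproducible output
n = 100                        # number of employees (assumed, as in the answer)

df = pd.DataFrame({
    # replace=False samples without replacement, so every ID is distinct
    'EmployeeID': rng.choice(np.arange(123456, 135000), size=n, replace=False),
    # integers() excludes the upper bound, so this yields values 1 to 5
    'skill': rng.integers(1, 6, size=n),
})
print(df.head(10))
```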
Upvotes: 1