Reputation: 3318
I have a dataframe with 3 columns: longitude, latitude, name (FIG 1). I need to add a column "altitude" holding one random number for each name (see FIG 2).
The random numbers must range from 200 to 2000.
FIG 1
FIG 2
Upvotes: 0
Views: 98
Reputation: 7353
Since your dummy data was not in a reproducible format, I made my own. Here is the shorter version of the solution. There is also a convenience function provided below (random_update_altitude()). I have also given you control over the random sequence generated, through the seed argument of the convenience function. This will help you make the result reproducible.
Note: you may also choose the type of random number distribution: uniform (np.random.rand, np.random.randint) or normal (np.random.randn).
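As a rough sketch of the distribution choices named in the note, the snippet below draws one value of each kind and maps it onto the 200 to 2000 range from the question. The variable names, the seed value, and the spread used for the normal draw are assumptions made for illustration only.

```python
import numpy as np

low, high = 200, 2000   # range asked for in the question
np.random.seed(0)       # arbitrary seed, only to make the draws repeatable

# Uniform float in [low, high): scale and shift a draw from [0, 1)
uniform_float = low + (high - low) * np.random.rand()

# Uniform integer in [low, high]: randint excludes its upper bound, hence high + 1
uniform_int = np.random.randint(low, high + 1)

# Normal draw centred on the midpoint, clipped so it stays inside the range
# (a standard deviation of (high - low) / 6 is an assumed, illustrative choice)
mean, std = (low + high) / 2, (high - low) / 6
normal_clipped = float(np.clip(mean + std * np.random.randn(), low, high))

print(uniform_float, uniform_int, normal_clipped)
```

Clipping is only one way to keep a normal draw inside fixed bounds; it piles a little probability mass onto the two endpoints, which may or may not matter for your use case.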
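ceiling, base = 4000, 0  # use 2000 and 200 for the range in the question
for name in df['name'].unique():
    # one random height per unique name, uniform in [base, ceiling)
    height = base + (ceiling - base)*np.random.rand()
    df.loc[df['name']==name, 'Altitude'] = height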
import numpy as np
import pandas as pd

def random_update_altitude(df, column='Altitude', ceiling=4000, base=0, seed=0):
    # create the target column if it does not exist yet
    if column not in df.columns:
        df[column] = np.nan
    # fix the seed so the same altitudes are generated on every run
    np.random.seed(seed=seed)
    for name in df['name'].unique():
        # one random height per unique name, uniform in [base, ceiling)
        height = base + (ceiling - base)*np.random.rand()
        df.loc[df['name']==name, column] = height
    return df

df = random_update_altitude(df, column='Altitude', ceiling=4000, seed=0)
print(df)
def make_dummy_data():
    names = 'abcdefghijklmnopqrstuvwxyz'
    names = list(names.upper())
    df = pd.DataFrame({'name': names[:5] + names[3:7] + names[:3]})
    df = df.sort_values(by=['name']).reset_index(drop=True)
    return df

df = make_dummy_data()
print(df)
Output:
name
0 A
1 A
2 B
3 B
4 C
5 C
6 D
7 D
8 E
9 E
10 F
11 G
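To check the effect of the seed argument described above, here is a minimal, self-contained sketch. It is not the answer's function: assign_altitude is a hypothetical helper that uses NumPy's Generator API (np.random.default_rng) instead of np.random.seed, and it hard-codes a dummy frame with the same names as the output shown.

```python
import numpy as np
import pandas as pd

def assign_altitude(df, low=200, high=2000, seed=0):
    """Return a copy of df with one random altitude per unique name (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    names = df['name'].unique()
    # one uniform draw per unique name, then spread to the rows through map()
    lookup = dict(zip(names, rng.uniform(low, high, size=len(names))))
    return df.assign(altitude=df['name'].map(lookup))

df = pd.DataFrame({'name': list('AABBCCDDEEFG')})

first = assign_altitude(df, seed=0)
second = assign_altitude(df, seed=0)
other = assign_altitude(df, seed=1)

# the same seed reproduces the same altitudes; a different seed does not
print(first['altitude'].equals(second['altitude']))
print(first['altitude'].equals(other['altitude']))
```

Returning a copy through assign() leaves the input frame untouched, which makes it easy to compare runs side by side.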
Upvotes: 2
Reputation: 30920
IIUC, use DataFrame.groupby + transform:
import numpy as np
# one draw per group; randint excludes the upper bound, so 2001 allows 2000 itself
df['altitude'] = df.groupby('name')['name'].transform(lambda x: np.random.randint(200, 2001))
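If calling a Python lambda once per group becomes slow on a large frame, one possible alternative is to draw all the values at once and index them by group code. This is a sketch under assumptions, not part of the answer above: it uses pd.factorize and the Generator API, and the dummy frame and seed are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': list('AABBCCDDEEFG')})  # assumed dummy data

rng = np.random.default_rng(0)  # arbitrary seed for repeatability

# codes[i] is the integer id of the name in row i; uniques holds the distinct names
codes, uniques = pd.factorize(df['name'])

# one integer per distinct name; endpoint=True makes 2000 a possible value
draws = rng.integers(200, 2000, size=len(uniques), endpoint=True)

# index the per-name draws by each row's group code
df['altitude'] = draws[codes]
print(df)
```

Rows that share a name get the same altitude because they share a group code, which is the same guarantee the groupby + transform version gives.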
Upvotes: 3