Aizzaac

Reputation: 3318

How to add a column of random numbers to a dataframe by each value in one of the columns?

I have a dataframe with 3 columns: longitude, latitude, name (FIG 1). I need to add an "altitude" column with random numbers for each name (see FIG 2).

The random numbers must be between 200 and 2000.

FIG 1

FIG 2

Upvotes: 0

Views: 98

Answers (2)

CypherX

Reputation: 7353

Since your sample data was not in a reproducible format, I made my own. Here is the short version of the solution, followed by a convenience function (random_update_altitude()). The function's seed argument controls the random sequence, which makes the result reproducible.

Note: you may also choose the type of random number distribution: uniform (np.random.rand, np.random.randint) or normal (np.random.randn).
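To make the choice of distribution concrete, here is a minimal sketch that draws one value per unique name and maps it back onto the rows. It uses np.random.default_rng rather than the legacy functions named above; the sample frame, the column names alt_uniform, alt_int and alt_normal, and the mean and spread of the normal draw are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; only the 'name' column matters here.
df = pd.DataFrame({"name": list("AABBCCDDEEFG")})

rng = np.random.default_rng(0)  # seeded Generator for reproducible draws
names = df["name"].unique()
low, high = 200, 2000  # range requested in the question

# Uniform floats in [low, high)
uniform = pd.Series(rng.uniform(low, high, size=len(names)), index=names)
# Uniform integers in [low, high], upper endpoint included
integers = pd.Series(
    rng.integers(low, high, size=len(names), endpoint=True), index=names
)
# Normal draws (assumed mean 1100, std 300), clipped into the range
normal = pd.Series(rng.normal(1100, 300, size=len(names)), index=names).clip(low, high)

# Map each per-name draw back onto every row carrying that name
df["alt_uniform"] = df["name"].map(uniform)
df["alt_int"] = df["name"].map(integers)
df["alt_normal"] = df["name"].map(normal)
print(df)
```

Clipping a normal draw piles up values at the bounds, so it only makes sense if that distortion is acceptable for your data.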

Code only

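import numpy as np

base, ceiling = 200, 2000  # altitude range requested in the question
for name in df['name'].unique():
    # one uniform draw in [base, ceiling), shared by every row with this name
    height = base + (ceiling - base) * np.random.rand()
    df.loc[df['name'] == name, 'Altitude'] = height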

Code with Function (for ease of use)

import numpy as np
import pandas as pd

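def random_update_altitude(df, column='Altitude', ceiling=4000, base=0, seed=0):
    """Assign one uniform random value in [base, ceiling) to each unique name."""
    if column not in df.columns:
        df[column] = np.nan  # float placeholder, filled in below

    np.random.seed(seed)  # fix the seed so the draws are reproducible
    for name in df['name'].unique():
        height = base + (ceiling - base) * np.random.rand()
        df.loc[df['name'] == name, column] = height  # honor the column argument

    return df

df = random_update_altitude(df, column='Altitude', base=200, ceiling=2000, seed=0)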
print(df)

Dummy Data

def make_dummy_data():
    names = 'abcdefghijklmnopqrstuvwxyz'
    names = list(names.upper())
    df = pd.DataFrame({'name': names[:5] + names[3:7] + names[:3]})
    df = df.sort_values(by=['name']).reset_index(drop=True)
    return df

df = make_dummy_data()
print(df)

Output:

   name
0     A
1     A
2     B
3     B
4     C
5     C
6     D
7     D
8     E
9     E
10    F
11    G

Upvotes: 2

ansev

Reputation: 30920

If I understand correctly, you can use DataFrame.groupby with transform:

import numpy as np
# One draw per group; randint excludes the upper bound, so pass 2001 if 2000 must be possible
df['altitude'] = df.groupby('name')['name'].transform(lambda x: np.random.randint(200, 2000))
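transform calls the function once per group and broadcasts the returned scalar to every row of that group, which is why all rows with the same name end up with the same altitude. The following is a small self-contained sketch of that behaviour, not the answer's exact code: the sample frame is made up, and the seeded Generator with endpoint=True is an assumption added so that the run is reproducible and 2000 itself can be drawn.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with repeated names
df = pd.DataFrame({"name": list("AABBCCDDEEFG")})

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

# The lambda runs once per group; its scalar result is repeated for each row
df["altitude"] = df.groupby("name")["name"].transform(
    lambda s: int(rng.integers(200, 2000, endpoint=True))
)

# Sanity check: every name maps to exactly one altitude
print(df.groupby("name")["altitude"].nunique())
print(df)
```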

Upvotes: 3
