Chris90

Reputation: 1998

How to create random floats and add them as a dataframe column

If I have a df like the one below:

date       | id | 
12/02/2012   b2             
12/03/2013   b6            
11/23/2013   b3 

      

I want to add two new columns of mock data, fake_rates and fake_minutes, as shown below, where the rates range from 0.00 to 3.00 and the minutes range from 0.0 to 30.0:

date       | id | fake_rates | fake_minutes
12/02/2012   b2     1.05        2.0
12/03/2013   b6     .56         1.6
12/03/2013   b8     .33         11.2
11/23/2013   b3     .19         122.0

Then I want to group by date, so that the rates and minutes are the averages for each date.

Example output:

date       | rates | minutes
12/01/2012   1.39    23.00
12/02/2012   1.29    22.33

Thanks!

Upvotes: 1

Views: 1490

Answers (1)

Trenton McKinney

Reputation: 62493

  • Use numpy.random.uniform because its low and high parameters specify the value range.
  • Use numpy.round to specify the number of decimal places for the data.
import numpy as np
import pandas as pd

# setup the dataframe
df = pd.DataFrame({'date': ['12/02/2012', '12/03/2013', '11/23/2013', '12/02/2012', '12/03/2013', '11/23/2013'], 'id': ['b2', 'b6', 'b3', 'b2', 'b6', 'b3']})

# add synthetic data
np.random.seed(365)
df['fake_minutes'] = np.round(np.random.uniform(0.0, 30.0, size=len(df)), 2)
df['fake_rates'] = np.round(np.random.uniform(0.0, 3.0, size=len(df)), 2)

# set the date to a datetime format
df.date = pd.to_datetime(df.date)

# display(df)
        date  id  fake_minutes  fake_rates
0 2012-12-02  b2         28.24        2.30
1 2013-12-03  b6         19.25        0.92
2 2013-11-23  b3         20.54        1.33
3 2012-12-02  b2         17.66        0.33
4 2013-12-03  b6         16.32        1.32
5 2013-11-23  b3         11.04        2.26

# groupby and aggregate the mean
dfg = df.groupby('date', as_index=False).agg({'fake_minutes': 'mean', 'fake_rates': 'mean'})

# display(dfg)
        date  fake_minutes  fake_rates
0 2012-12-02        22.950       1.315
1 2013-11-23        15.790       1.795
2 2013-12-03        17.785       1.120
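
The question's example output names the grouped columns rates and minutes and shows two decimal places. A minimal sketch of one way to get that shape, assuming pandas 0.25 or later for named aggregation; the use of np.random.default_rng in place of np.random.seed and the explicit date format are my own choices here, not part of the code above, and the random values will differ from those shown.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the answer above
df = pd.DataFrame({
    'date': ['12/02/2012', '12/03/2013', '11/23/2013',
             '12/02/2012', '12/03/2013', '11/23/2013'],
    'id': ['b2', 'b6', 'b3', 'b2', 'b6', 'b3'],
})

# Assumption: the dates are month/day/year, as in the question
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# Assumption: a seeded Generator stands in for the global np.random.seed call
rng = np.random.default_rng(365)
df['fake_rates'] = rng.uniform(0.0, 3.0, size=len(df)).round(2)
df['fake_minutes'] = rng.uniform(0.0, 30.0, size=len(df)).round(2)

# Named aggregation: each keyword is an output column,
# mapped to a (source column, aggregation function) pair
dfg = (
    df.groupby('date', as_index=False)
      .agg(rates=('fake_rates', 'mean'), minutes=('fake_minutes', 'mean'))
      .round({'rates': 2, 'minutes': 2})
)

print(dfg)
```

Rounding after the aggregation keeps the means at two decimal places; rounding only the inputs does not, as the 22.950 and 1.315 values above show.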

Upvotes: 2
