Reputation: 3667
I have a DataFrame df that looks like this:
df:
id dob
1 7/31/2018
2 6/1992
I want to generate 88799 random dates to go into column dob in the dataframe, between 1960-01-01 and 1990-12-31, while keeping the format mm/dd/yyyy with no timestamp.
How would I do this?
I tried:
date1 = (1960,01,01)
date2 = (1990,12,31)
for i range(date1,date2):
df.dob = i
Upvotes: 1
Views: 3012
Reputation: 51335
I would figure out how many days are in your date range, then select 88799 random integers in that range, and finally add them as a timedelta with unit='d' to your minimum date:
import numpy as np
import pandas as pd
min_date = pd.to_datetime('1960-01-01')
max_date = pd.to_datetime('1990-12-31')
# number of days in the range, counting both endpoints
d = (max_date - min_date).days + 1
df['dob'] = min_date + pd.to_timedelta(np.random.randint(d, size=88799), unit='d')
>>> df.head()
dob
0 1963-03-05
1 1973-06-07
2 1970-08-24
3 1970-05-03
4 1971-07-03
>>> df.tail()
dob
88794 1965-12-10
88795 1968-08-09
88796 1988-04-29
88797 1971-07-27
88798 1980-08-03
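If you want something you can run on its own, here is a minimal sketch of the same steps wrapped in a function. The name random_dates and its seed parameter are my own, not part of pandas, and I use numpy's default_rng instead of np.random.randint so the draw can be repeated. It also sizes the result with len(df) rather than hard-coding 88799, on the assumption that you want one date per row.

```python
import numpy as np
import pandas as pd


def random_dates(start, end, n, seed=None):
    """Return n random dates between start and end, both inclusive.

    Hypothetical helper for illustration; not a pandas function.
    """
    start = pd.to_datetime(start)
    end = pd.to_datetime(end)
    n_days = (end - start).days + 1              # days in the range, endpoints included
    rng = np.random.default_rng(seed)            # seeded generator for repeatable draws
    offsets = rng.integers(0, n_days, size=n)    # day offsets in [0, n_days)
    return start + pd.to_timedelta(offsets, unit="D")


# Small stand-in frame; replace with your own df
df = pd.DataFrame({"id": range(1, 6)})
df["dob"] = random_dates("1960-01-01", "1990-12-31", n=len(df), seed=0)
print(df)
```

Passing the same seed gives the same dates each run, which helps if you are generating test data.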
EDIT: You can format your dates using .strftime('%m/%d/%Y'), but note that this will slow down the execution significantly:
# np is numpy (import numpy as np)
df['dob'] = (min_date + pd.to_timedelta(np.random.randint(d, size=88799), unit='d')).strftime('%m/%d/%Y')
>>> df.head()
dob
0 02/26/1969
1 04/09/1963
2 08/29/1984
3 02/12/1961
4 08/02/1988
>>> df.tail()
dob
88794 02/13/1968
88795 02/05/1982
88796 07/03/1964
88797 06/11/1976
88798 11/17/1965
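One thing to keep in mind: after strftime the column holds strings, not dates, so sorting or date arithmetic on it will not behave like it does on real datetimes. A possible way around that, sketched below, is to keep dob as datetime and put the mm/dd/yyyy text in a second column. The column name dob_str is just something I picked for the example, and the 5-row frame stands in for your real one.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
min_date = pd.to_datetime("1960-01-01")
max_date = pd.to_datetime("1990-12-31")
n_days = (max_date - min_date).days + 1   # days in the range, endpoints included

# Small stand-in frame; replace with your own df
df = pd.DataFrame({"id": range(1, 6)})
df["dob"] = min_date + pd.to_timedelta(rng.integers(0, n_days, size=len(df)), unit="D")

# Formatted copy as text; dob itself stays a datetime column.
# dob_str is a hypothetical column name chosen for this example.
df["dob_str"] = df["dob"].dt.strftime("%m/%d/%Y")

print(df.dtypes)
print(df)
```

If you only need the mm/dd/yyyy form when writing a file, to_csv also accepts a date_format argument, so you may not need the string column at all.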
Upvotes: 8