Reputation: 56
I have DataFrame with almost 500 rows and 3 columns.
One of the columns has a string of dates and each cell has a unique date, but some cell have a common date and some cells are seem empty.
I'm trying to find the frequency of each day in a cell
df|Number_of_dates | Date
--|--------------------|---------------------
0 | 0.0 | []
1 | 3.0 | ['2006-01-01' '2006-03-22' '2019-07-29']
2 | 8.0 | ['2006-01-01' '2006-04-13' '2006-07-18' '2006-...
3 | 1.0 | ['2006-07-18']
4 | 1.0 | ['2019-07-29']
5 | 0.0 | []
6 | 397.0 | ['2019-01-02' '2019-01-03' '2019-01-04' '2019-...
Result:
df_1 |Date | Frequency
-----|------------ |---------------------
0 | 2006-01-01 |2
1 | 2006-03-22 |1
2 | 2006-04-13 |1
3 | 2006-07-18 |2
4 | 2019-07-29 |3
It would be very helpful if you could provide some guidance.
Thanks in advance
additional information: I noticed that each cell has a string value instead of a list
Sample DataFrame
d = {"Date":[ "['2005-02-02' '2005-05-04' '2005-08-03' '2005-11-02' '2006-02-01' '2006-05-03']",
"['2006-01-31' '2006-02-01' '2006-03-16'\n '2006-06-13']",
"['2005-10-12' '2005-10-13' '2005-10-14'\n '2005-10-17']",
"[]",
"['2005-07-25' '2005-07-26' '2005-07-27'\n '2005-07-28' '2005-07-29' '2005-08-01' '2005-08-02' '2005-08-03'\n '2005-08-04' '2005-08-05']",
"['2005-03-15' '2005-03-16' '2005-03-17'\n '2005-03-18' '2005-03-21' '2005-03-22' '2005-03-23' '2005-03-24' \n'2005-03-28' '2005-03-29' '2005-03-30' '2005-03-31' '2005-04-01'\n '2005-04-04']",
"['2005-03-16' '2005-03-17' '2005-07-27'\n '2006-06-13']",
"['2005-02-02' '2005-05-04' '2005-03-16' '2005-03-17']",
"[]"
]
}
df = pd.DataFrame(d)
Upvotes: 0
Views: 80
Reputation: 862481
Use DataFrame.explode
with GroupBy.size
:
#create list from sample data
df['Date'] = df['Date'].str.strip('[]').str.split()
df_1 = df.explode('Date').groupby('Date').size().reset_index(name='Frequency')
print (df_1.head(10))
Date Frequency
0 '2005-02-02' 2
1 '2005-03-15' 1
2 '2005-03-16' 3
3 '2005-03-17' 3
4 '2005-03-18' 1
5 '2005-03-21' 1
6 '2005-03-22' 1
7 '2005-03-23' 1
8 '2005-03-24' 1
9 '2005-03-28' 1
Upvotes: 2