Skittles
Skittles

Reputation: 151

Transform Dictionary with date values into Dataframe with date values as counts per each month

I have been struggling with transforming the below x dictionary, into the table/dictionary below.

 x = {'John': 0,
 'Dan': 0,
 'Tim': 0,
 'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
 'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}

I want to put the names (keys) into Column 0, the dates I would like to present just the count per each year-month column

Result:

df = pd.DataFrame(data = {'Name':['John', 'Dan', 'Tim', 'Andrew', 'Jack'],
    '2021-12':[0,0,0,0,1],
   '2022-01' :[0,0,0,0,2],
   '2022-02' :[0,0,0,0,0],
   '2022-03' :[0,0,0,0,3],
   '2022-04' :[0,0,0,1,0],
    '2022-05' :[0,0,0,0,1],
    '2022-06' :[0,0,0,3,3],
    '2022-07' :[0,0,0,0,0],
    '2022-08' :[0,0,0,2,1],
    '2022-09' :[0,0,0,0,0],
    '2022-10' :[0,0,0,0,0],
    '2022-11' :[0,0,0,1,0],
    '2022-12' :[0,0,0,1,0]})

This is the final df result(values are representing the counts per each Month): enter image description here

Upvotes: 1

Views: 819

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

IIUC, you can try something like this:

x = {'John': 0,
 'Dan': 0,
 'Tim': 0,
 'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
 'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}

dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()

dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d')

df_out = dfe.groupby(['name',pd.Grouper(key='dates', freq='M')]).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01-31', axis=1)
df_out = df_out.reindex(pd.date_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
df_out.columns = df_out.columns.strftime('%Y-%m')
print(df_out)

Output:

        2021-12  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
name                                                                                                                       
Andrew        0        0        0        0        2        0        3        0        2        0        0        1        1
Dan           0        0        0        0        0        0        0        0        0        0        0        0        0
Jack          1        2        0        3        0        1        3        0        1        0        0        0        0
John          0        0        0        0        0        0        0        0        0        0        0        0        0
Tim           0        0        0        0        0        0        0        0        0        0        0        0        0

Working with periods instead of dates and converting to strings....

x = {'John': 0,
 'Dan': 0,
 'Tim': 0,
 'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
 'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}

dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()

dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d').dt.to_period('M')

df_out = dfe.groupby(['name', 'dates']).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01', axis=1)
df_out = df_out.reindex(pd.period_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
print(df_out)

Output:

        2021-12  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
name                                                                                                                       
Andrew        0        0        0        0        2        0        3        0        2        0        0        1        1
Dan           0        0        0        0        0        0        0        0        0        0        0        0        0
Jack          1        2        0        3        0        1        3        0        1        0        0        0        0
John          0        0        0        0        0        0        0        0        0        0        0        0        0
Tim           0        0        0        0        0        0        0        0        0        0        0        0        0

Upvotes: 2

Related Questions