Reputation: 1460
I have a dataset and want to extract the count of the ID repeated in that column based on other columns session time.
Dataset :
person ID Session type ID
1001 morning 1
1001 evening 1
1002 morning 2
1002 night 1
1001 morning 2
1002 evening 1
1002 night 2
Based on my type ID == 1 I want to get count of them for every session as mentioned below
Output:
person ID morning_count evening_count night_count
1001 1 1 0
1002 0 1 1
Based on my type ID ==1 it should give me count of that fro every session column.
Upvotes: 0
Views: 326
Reputation: 11
Hope this python dictionary based alternative approach would work:
counts = {list(set(dataset['person ID']))[i]:{'morning_count': 0, 'evening_count':0, 'night_count':0} for i in range(list(dataset['person ID']))}
df_len = len(dataset)
for i in range(df_len):
if(dataset[i]['type ID'] == 1):
if(dataset[i]['Session'] == 'morning'):
counts[dataset[i]['person ID']]['morning_count'] += 1
elif(dataset[i]['Session'] == 'evening'):
counts[dataset[i]['person ID']]['evening_count'] += 1
evening_counts[dataset[i]['person ID']] += 1
elif(dataset[i]['Session'] == 'night'):
counts[dataset[i]['person ID']]['night_count'] += 1
print(counts)
Here you could print by using another loop as you require.
This would give the basic idea to get the same result by using only python dictionaries
Upvotes: 1
Reputation: 862851
First create new column withDataFrame.assign
filled integers for matched rows, then use DataFrame.pivot_table
with aggregate sum
and last some data cleaning with DataFrame.add_suffix
, DataFrame.reset_index
and DataFrame.rename_axis
:
df1 = (df.assign(count=df['type ID'].eq(1).astype(int))
.pivot_table(index='person ID',
columns='Session',
values='count',
aggfunc='sum',
fill_value=0)
.add_suffix('_count')
.reset_index()
.rename_axis(None, axis=1))
print (df1)
person ID evening_count morning_count night_count
0 1001 1 1 0
1 1002 1 0 1
Upvotes: 1