Reputation: 288
This is my data frame.
acc_index veh_count veh_type
001 1 1
002 2 1
002 2 2
003 2 1
003 2 2
004 1 1
005 2 1
005 2 3
006 1 2
007 2 1
007 2 2
008 2 1
008 2 1
009 3 1
009 3 1
009 3 2
acc_index is unique for each accident
veh_count shows how many vehicles are involved in one accident
veh_type shows the type of vehicles involved in an accident (1=bicycle, 2=car, 3=bus).
What I want to do is to count the number of accidents between cars and bicycles (so, where veh_type=1 and veh_type=9 for the same acc_index), even if there were more cars or bicycles involved, I still want to count it as one accident. How can I do that?
I tried to do it with the code below, but I get the count of all accidents involving cars or bikes, and I want to get only the ones between them.
df[(df['veh_count'] >=2) & (df.veh_type.isin(['1','2']))].groupby(['acc_index', 'veh_count', 'veh_type']).count()
I want to get something like this below, but also with the whole dataframe, and not only total sum.
acc_index veh_count veh_type count
002 2 1
002 2 2
count 1
003 2 1
003 2 2
count 1
007 2 1
007 2 2
count 1
009 3 1
009 3 1
009 3 2
count 1
total_count 4
If you have a better solution/idea, I would appreciate.
Upvotes: 2
Views: 156
Reputation: 150735
IIUC, you can check veh_type
for those of interest and groupby:
(df.assign(car=df.veh_type.eq(1),
bike=df.veh_type.eq(2)) # change 2 to correct type
[['acc_index','car','bike']]
.groupby('acc_index')
.any()
.all(1).sum()
)
Output:
4
Update: If you want all the the rows:
s = (df.assign(car=df.veh_type.eq(1),
bike=df.veh_type.eq(2)) # change 2 to correct type
[['acc_index','car','bike']]
.groupby('acc_index')
.any()
.all(1)
)
df[df['acc_index'].map(s)]
Output:
acc_index veh_count veh_type
1 2 2 1
2 2 2 2
3 3 2 1
4 3 2 2
9 7 2 1
10 7 2 2
13 9 3 1
14 9 3 1
15 9 3 2
Upvotes: 4