Reputation: 195
I am trying to make a flag for the questions that were answered. Below is a sample data frame.
userid  message                    type
1       hi                         incoming
1       how may I help you         outgoing
1       looking for a job          incoming
1       whats your name            outgoing
1       nitin                      incoming
1       kansal                     incoming
1       whats your age             outgoing
2       hi                         incoming
2       how may I help you         outgoing
3       hi                         incoming
3       how may I help you         outgoing
3       looking for a restaurant   incoming
3       can you suggest something  incoming
3       whats your name            outgoing
An outgoing question should be flagged 1 if the next message from the same user id is incoming (that is, the user replied), and 0 otherwise. The output data frame would look like this:
userid  message                    type      got_response
1       hi                         incoming
1       how may I help you         outgoing  1
1       looking for a job          incoming
1       whats your name            outgoing  1
1       nitin                      incoming
1       kansal                     incoming
1       whats your age             outgoing  0
2       hi                         incoming
2       how may I help you         outgoing  0
3       hi                         incoming
3       how may I help you         outgoing  1
3       looking for a restaurant   incoming
3       can you suggest something  incoming
3       whats your name            outgoing  0
I am looking for a NumPy-based solution. I have done this using a for loop, but the real dataset has millions of rows, so it takes hours to complete.
Upvotes: 0
Views: 68
Reputation: 195
df['got_response'] = ((df['userid'] == df['userid'].shift(-1)) & df['type'].eq('outgoing') & df['type'].shift(-1).eq('incoming')).astype(int)
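The shift-based comparison above assigns a value to incoming rows as well, whereas the expected output in the question leaves them blank. Here is a rough NumPy sketch of the same idea that keeps incoming rows empty (NaN). It assumes the rows are already grouped by userid and in chronological order within each user, as in the sample; the intermediate names (`same_user_next`, `next_is_in`, `answered`) are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "userid": [1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3],
    "message": [
        "hi", "how may I help you", "looking for a job", "whats your name",
        "nitin", "kansal", "whats your age",
        "hi", "how may I help you",
        "hi", "how may I help you", "looking for a restaurant",
        "can you suggest something", "whats your name",
    ],
    "type": [
        "incoming", "outgoing", "incoming", "outgoing",
        "incoming", "incoming", "outgoing",
        "incoming", "outgoing",
        "incoming", "outgoing", "incoming", "incoming", "outgoing",
    ],
})

userid = df["userid"].to_numpy()
msg_type = df["type"].to_numpy()
is_out = msg_type == "outgoing"
is_in = msg_type == "incoming"

# Compare each row with the row after it; the last row has no successor,
# so its entries stay False.
same_user_next = np.zeros(len(df), dtype=bool)
same_user_next[:-1] = userid[:-1] == userid[1:]
next_is_in = np.zeros(len(df), dtype=bool)
next_is_in[:-1] = is_in[1:]

# An outgoing row is answered if the next row is an incoming message
# from the same user.
answered = is_out & same_user_next & next_is_in

# 1/0 on outgoing rows, NaN on incoming rows (assumption: NaN stands in
# for the blank cells shown in the expected output).
df["got_response"] = np.where(is_out, answered.astype(float), np.nan)
print(df)
```

Because NaN forces the column to float, the flags show up as 1.0 and 0.0. If the real data is not sorted this way, it would need a stable sort by userid (and by a timestamp, if one exists) before this comparison is meaningful.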
Upvotes: 1