SumitArya

Reputation: 121

Fetch the positional indexes where a request pair is matched in a web access session

I have this subset of a dataframe

lf = structure(list(session_id = c(48L, 48L, 48L, 48L, 48L, 48L, 54L, 
54L, 54L, 54L, 54L, 54L, 72L, 72L, 72L, 72L, 72L, 74L, 74L, 74L, 
74L, 74L, 78L, 78L, 78L, 78L, 78L, 90L, 90L, 90L), datetime = structure(c(1457050110, 
1457050111, 1457050112, 1457050114, 1457050117, 1457050118, 1457052045, 
1457052048, 1457052050, 1457052051, 1457052052, 1457052054, 1457057067, 
1457057067, 1457057067, 1457057070, 1457057071, 1457058143, 1457058143, 
1457058144, 1457058149, 1457058150, 1457059193, 1457059193, 1457059195, 
1457059198, 1457059199, 1457063485, 1457063486, 1457063486), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), request = c(7, 7, 14, 20, 9, 4, 9, 
1, 12, 20, 6, 12, 4, 15, 8, 8, 12, 10, 6, 6, 13, 1, 5, 6, 20, 
1, 8, 3, 6, 13)), .Names = c("session_id", "datetime", "request"
), row.names = c(NA, -30L), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"))

Now I want to check whether a given pair of requests (req1, req2) occurs within each session of this dataframe and, if it does, fetch the positional indexes where the matches occur.

I am using this piece of code:

library(dplyr)
lf1 = lf %>% group_by(session_id) %>% do(positions = match(c(1,6),.$request))

As you can see, I am using the request pair (1, 6) as an example for demonstration purposes.
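To make clear what `match` returns inside each group, here is a minimal sketch on two sessions taken from the data above. The variable names `req_54`, `req_90`, `pos_54` and `pos_90` are only illustrative and are not part of my real code.

```r
# Requests of session 54, copied from the dput above
req_54 <- c(9, 1, 12, 20, 6, 12)
# match() gives the position of the FIRST occurrence of each value
pos_54 <- match(c(1, 6), req_54)
pos_54
# [1] 2 5

# Requests of session 90: request 1 never occurs, so its position is NA
req_90 <- c(3, 6, 13)
pos_90 <- match(c(1, 6), req_90)
pos_90
# [1] NA  2
```

So a session only counts as a match for the pair when neither of the two returned positions is NA.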

Desired output:

[image: desired output table, not reproduced here]

If possible, I also want to filter out the session_ids that contain NAs, so that only session_ids where both position1 and position2 are non-NA remain.

Upvotes: 1

Views: 37

Answers (1)

akrun

Reputation: 887711

If we need the summarised output from 'lf1', ungroup the data, then filter to keep only the rows whose list elements contain no NA, and mutate to create 'position1' and 'position2'

library(dplyr)
library(purrr)
lf1 %>%
   ungroup %>% 
   filter(map_lgl(positions, ~all(!is.na(.)))) %>%
   mutate(position1 = map_int(positions, ~.[1]), position2 = map_int(positions, ~.[2]))
# A tibble: 3 x 4
#  session_id positions position1 position2
#        <int>    <list>     <int>     <int>
#1         54 <int [2]>         2         5
#2         74 <int [2]>         5         2
#3         78 <int [2]>         4         2
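If the list column is not needed, a possible alternative is to compute both positions directly with `summarise` and skip `do` altogether. This is only a sketch under that assumption; `toy` is a hypothetical cut-down copy of three sessions from the question's data, and `res` is just an illustrative name.

```r
library(dplyr)

# Hypothetical toy data: sessions 54, 78 and 90 copied from the question's `lf`
toy <- tibble(
  session_id = c(54L, 54L, 54L, 54L, 54L, 54L,
                 78L, 78L, 78L, 78L, 78L,
                 90L, 90L, 90L),
  request    = c(9, 1, 12, 20, 6, 12,
                 5, 6, 20, 1, 8,
                 3, 6, 13)
)

res <- toy %>%
  group_by(session_id) %>%
  # First position of each request of the pair (1, 6) within the session
  summarise(position1 = match(1, request),
            position2 = match(6, request)) %>%
  # Keep only the sessions where both requests were found
  filter(!is.na(position1), !is.na(position2))

res
```

Here session 90 is dropped because request 1 never occurs in it, while sessions 54 and 78 keep the positions (2, 5) and (4, 2), the same values as in the output above.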

Upvotes: 2
