Reputation: 543

Get time length of friendship in network R

I have a network dataset of adolescent friendships over 7 waves. I'm trying to get the length of a given dyad (directed friendship).

SAMPLE HAVE DATA:

 ego    alter   wave
   1        5      1
   1        4      1
   1        5      2
   1        2      2
   1        3      2
   2        8      1
   2        8      2
   2        8      3
   3        4      1
   3        7      1
   3        6      1
   3        6      2
   3        7      3
   3        6      3

WANT DATA:

 ego    alter   friendship_length
   1        5     2  
   1        4     1 
   1        2     1 
   1        3     1  
   2        8     3        
   3        4     1 
   3        7     1 
   3        6     3

Here's what I've already tried:

edges_wide <- edges_long %>% 
              select(ego, alter, wave) %>%
              group_by(ego, alter) %>% 
              mutate(col=seq_along(ego))%>% # add a column indicator
              spread(key=col, value=wave)

Which gives me this:

 ego    alter   col3    col4    col5
   1        5      1       2      NA
   1        4      1      NA      NA                    
   1        2      2      NA      NA
   1        3      2      NA      NA
   2        8      1       2       3            
   3        4      1      NA      NA
   3        7      1       3      NA
   3        6      1       2       3

From here I'm not sure how to get the wave span (length) of each directed friendship. Non-consecutive nominations (like ego 3, alter 7) should not be counted toward the length.

Upvotes: 1

Answers (3)

desval

Reputation: 2435

It should be possible to have a shorter solution.

If I understand correctly, you want to count only the first run of consecutive waves in which ego and alter have a relationship. We can add a within-pair index with row_number(), shift it by min(wave) - 1 to account for pairs whose first wave is later than 1, and then count the observations where wave and this shifted index coincide. For a given pair, as soon as one wave is skipped in the data, the two values differ from then on.

d %>%
  arrange(wave) %>%
  group_by(ego, alter) %>%
  mutate(id = row_number() + min(wave) - 1) %>%
  summarise(friendship_length = sum(wave == id))

# A tibble: 8 x 3
# Groups:   ego [3]
    ego alter friendship_length
  <int> <int>             <int>
1     1     2                 1
2     1     3                 1
3     1     4                 1
4     1     5                 2
5     2     8                 3
6     3     4                 1
7     3     6                 3
8     3     7                 1
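
To see why the comparison works, here is a minimal base-R sketch of the same index trick on a single pair (ego 3, alter 7, taken from the sample data). The variable names are only illustrative, and the waves are assumed to be sorted already.

```r
# Waves in which ego 3 nominated alter 7 (from the sample data), sorted
wave <- c(1, 3)

# Wave expected at each position if there were no gaps
id <- seq_along(wave) + min(wave) - 1   # 1 2

# Only the positions before the first gap match
wave == id        # TRUE FALSE
sum(wave == id)   # 1
```

The skipped wave 2 makes the second position disagree, so only the first nomination is counted.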

EDIT: Addressing the new comment. We want the longest run of consecutive friendship ties. Subtracting row_number() from wave creates a friendship-phase id for each pair: every wave in the same consecutive run gets the same integer, and each later run gets a different one. We can then count how many times each integer appears and take the max:

dd %>%
  arrange(wave) %>%
  group_by(ego, alter) %>%
  count(wave - row_number()) %>%
  summarise(friendship_length = max(n))

# A tibble: 9 x 3
# Groups:   ego [3]
    ego alter friendship_length
  <int> <int>             <dbl>
1     1     2                 1
2     1     3                 1
3     1     4                 1
4     1     5                 2
5     2     8                 3
6     3     4                 1
7     3     6                 3
8     3     7                 1
9     3     8                 3
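
As a rough illustration of the phase id, here is the same computation in base R for one pair (ego 3, alter 8 from the dd data below). The name run_id is just a placeholder for this sketch.

```r
# Waves for ego 3, alter 8 in dd, sorted as arrange(wave) would do
wave <- sort(c(2, 3, 8, 6, 7))       # 2 3 6 7 8

# Constant within a run of consecutive waves, larger for each later run
run_id <- wave - seq_along(wave)     # 1 1 3 3 3

# Size of each run, then the longest one
table(run_id)
max(table(run_id))                   # 3
```

Waves 2-3 form a run of length 2 and waves 6-8 a run of length 3, so the longest duration is 3.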

Data

library(dplyr)

d <- read.table(text = "
  ego alter wave
    1     5    1
    1     4    1
    1     5    2
    1     2    2
    1     3    2
    2     8    1
    2     8    2
    2     8    3
    3     4    1
    3     7    1
    3     6    1
    3     6    2
    3     7    3
    3     6    3", header = TRUE)


dd <- read.table(text = "
  ego alter wave
    1     5    1
    1     4    1
    1     5    2
    1     2    2
    1     3    2
    2     8    1
    2     8    2
    2     8    3
    3     4    1
    3     7    1
    3     6    1
    3     6    2
    3     7    3
    3     6    3
    3     8    2
    3     8    3
    3     8    8
    3     8    6
    3     8    7", header = TRUE)

Upvotes: 2

Alexlok

Reputation: 3134

One more possibility.

First, let's make a function that counts the length of a consecutive sequence:

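get_seq_len <- function(s){
  # s: wave numbers for one pair, assumed sorted in increasing order
  if(length(s) == 0) return(0)
  if(length(s) == 1) return(1)
  # prepend 1 so the first wave counts toward the initial run of unit steps
  consec_lengths <- rle(c(1, diff(s)))$lengths
  # length of the first run only
  return(consec_lengths[1])
}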

We can verify it works:

get_seq_len(numeric(0))
#> 0
get_seq_len(1)
#> 1
get_seq_len(1:4)
#> 4
get_seq_len(c(1:4, 4:5))
#> 4 (the repeated 4 breaks the run)
get_seq_len(c(1,3))
#> 1 (not consecutive)

Then we can simply use nesting to do that for each pair:

library(dplyr)
library(tidyr)  # nest()
library(purrr)  # map(), map_dbl()

edges_long %>%
  group_by(ego, alter) %>%
  nest() %>%
  mutate(vec_waves = map(data, ~ as.numeric(unlist(.x)))) %>% # convert dataframe to vector
  mutate(len = map_dbl(vec_waves, get_seq_len))
# A tibble: 8 x 5
# Groups:   ego, alter [8]
#     ego alter data             vec_waves   len
#    <dbl> <dbl> <list>           <list>    <dbl>
# 1     1     5 <tibble [2 x 1]> <dbl [2]>     2
# 2     1     4 <tibble [1 x 1]> <dbl [1]>     1
# 3     1     2 <tibble [1 x 1]> <dbl [1]>     1
# 4     1     3 <tibble [1 x 1]> <dbl [1]>     1
# 5     2     8 <tibble [3 x 1]> <dbl [3]>     3
# 6     3     4 <tibble [1 x 1]> <dbl [1]>     1
# 7     3     7 <tibble [2 x 1]> <dbl [2]>     1
# 8     3     6 <tibble [3 x 1]> <dbl [3]>     3
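
Note that get_seq_len() returns the length of the first run only. If the longest run is wanted instead, a variant along these lines might work. This is a sketch, and get_longest_seq_len is a hypothetical name that does not come from any package.

```r
# Hypothetical variant: length of the longest run of consecutive waves
get_longest_seq_len <- function(s) {
  if (length(s) == 0) return(0)
  s <- sort(unique(s))
  # a new run starts wherever the gap to the previous wave is not 1
  run_id <- cumsum(c(1, diff(s) != 1))
  as.numeric(max(table(run_id)))
}

get_longest_seq_len(c(2, 3, 6, 7, 8))
#> 3
get_longest_seq_len(c(1, 3))
#> 1
get_longest_seq_len(1:4)
#> 4
```

It should be usable in the map_dbl() call above in place of get_seq_len.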

Upvotes: 1

L. Tucker

Reputation: 543

This is probably a terrible way to do it, but it worked!

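library(dplyr)
library(tidyr)  # spread()

edges_wide <- edges_long %>%
              select(ego, alter, wave) %>%
              group_by(ego, alter) %>%
              mutate(col=seq_along(ego))%>% # add a column indicator
              spread(key=col, value=wave) %>%
              # rename() errors on a missing column, so this assumes
              # columns "1" to "7" all exist after spread()
              rename(col1 = "1", col2 = "2", col3 = "3",
                     col4 = "4", col5 = "5", col6 = "6",
                     col7 = "7")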
          
edges_wide <- edges_wide %>% 
              mutate(wave1 = case_when(col1 == 1 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              mutate(wave2 = case_when(col1 == 2 | col2 == 2 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              mutate(wave3 = case_when(col1 == 3 | col2 == 3 | col3 == 3 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              mutate(wave4 = case_when(col1 == 4 | col2 == 4 | col3 == 4 | col4 == 4 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              mutate(wave5 = case_when(col1 == 5 | col2 == 5 | col3 == 5 | col4 == 5 | col5 == 5 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              mutate(wave6 = case_when(col1 == 6 | col2 == 6 | col3 == 6 | col4 == 6 | col5 == 6 | col6 == 6 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              mutate(wave7 = case_when(col1 == 7 | col2 == 7 | col3 == 7 | col4 == 7 | col5 == 7 | col6 == 7 | col7 == 7 ~ 1,
                                       TRUE ~ as.numeric(0))) %>%
              select(ego, alter, wave1, wave2, wave3, wave4, wave5, wave6, wave7)
                   
most_consecutive_val = function(x, val = 1) {
   with(rle(x), if(all(values != val)) 0 else max(lengths[values == val]))
}

edges_wide$span <- apply(edges_wide[-c(1:2)], MARGIN = 1, most_consecutive_val)
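
For anyone checking the helper, here is a small sketch of what most_consecutive_val() returns on a few made-up 0/1 wave indicator rows. The function is repeated so the snippet runs on its own, and the example vectors are illustrative rather than taken from the real data.

```r
most_consecutive_val = function(x, val = 1) {
   with(rle(x), if(all(values != val)) 0 else max(lengths[values == val]))
}

# nominated in waves 1 and 3 only: longest run of 1s is 1
most_consecutive_val(c(1, 0, 1, 0, 0, 0, 0))
#> 1
# nominated in waves 2-3 and 5-7: longest run of 1s is 3
most_consecutive_val(c(0, 1, 1, 0, 1, 1, 1))
#> 3
# never nominated
most_consecutive_val(rep(0, 7))
#> 0
```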

Upvotes: 0
