PaulaSpinola

Reputation: 531

Add a column containing node degrees of a network to a dataframe

I have information on physicians working in different hospitals at different points in time. I would like to define networks at the hospital-period level so that peers are physicians who work together in the same hospital at the same time.

I would then like to compute node degrees by period. My final output should be a dataframe reporting the degree of each node in each period, including zero degrees for isolated nodes.

Consider the very simple example of hospitals x-y-w-z, periods 1-2 and physicians A-B-C-D-E.

mydf <- data.frame(hospital = c("x","x","x","x","x","y","y","y","w","w","w","w","z"), 
               period = c(1,1,1,2,2,1,2,2,1,1,2,2,2), 
               id = c("A","B","C","A","B","A","A","C","C","D","A","D","E"))

The code below constructs a dataframe with all pairs of connected physicians by hospital-period.

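library(dplyr)

# Self-join on hospital and period, then drop rows that pair a physician with themselves
relations <- mydf %>%
  left_join(mydf, by=c("hospital","period")) %>%
  filter(id.y!=id.x) %>%
  relocate(id.y,id.x)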

The code below gives the node degrees of each connected node by period.

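library(igraph)

# One named degree vector per period; only nodes that appear in relations are included
relations %>%
  group_by(period) %>%
  group_map(~ degree(simplify(graph_from_data_frame(.x, directed = FALSE))))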
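
To see why an isolated node cannot show up in that result, it may help to list the node-period pairs that never appear in relations. The sketch below rebuilds mydf and relations as defined above and uses dplyr's anti_join; the name isolated is only illustrative.

```r
library(dplyr)

mydf <- data.frame(hospital = c("x","x","x","x","x","y","y","y","w","w","w","w","z"),
                   period = c(1,1,1,2,2,1,2,2,1,1,2,2,2),
                   id = c("A","B","C","A","B","A","A","C","C","D","A","D","E"))

# Same pair construction as in the question
relations <- mydf %>%
  left_join(mydf, by = c("hospital", "period")) %>%
  filter(id.y != id.x) %>%
  relocate(id.y, id.x)

# Node-period pairs present in mydf but absent from relations
# (illustrative name: isolated)
isolated <- mydf %>%
  distinct(period, id) %>%
  anti_join(relations, by = c("period", "id" = "id.x"))

print(isolated)
```

With the example data this leaves a single row, E in period 2, so any degree computed from relations alone will omit it.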

The dataframe below is my desired output. Note that it includes node E at period 2 with zero degree.

output <- data.frame(node=c("A","B","C","D","A","B","C","D","E"),
                     period=c(1,1,1,1,2,2,2,2,2),
                     degree=c(2,2,3,1,3,1,1,1,0))

Upvotes: 0

Views: 218

Answers (2)

ThomasIsCoding

Reputation: 101247

You can try the code below (it needs dplyr, tidyr and igraph, and reuses the relations dataframe from the question)

mydf %>%
  arrange(period) %>%
  select(-hospital) %>%
  distinct() %>%
  group_by(period) %>%
  left_join(
    relations %>%
      group_by(period) %>%
      do(
        setNames(
          stack(degree(simplify(graph_from_data_frame(., directed = FALSE)))),
          c("degrees", "id")
        )
      )
  ) %>%
  mutate(degrees = replace_na(degrees, 0)) %>%
  ungroup()

which gives

  period id    degrees
   <dbl> <chr>   <dbl>
1      1 A           2
2      1 B           2
3      1 C           3
4      1 D           1
5      2 A           3
6      2 B           1
7      2 C           1
8      2 D           1
9      2 E           0

Upvotes: 1

Pete Kittinun

Reputation: 603

It seems your relations dataframe contains all the information needed.

library(dplyr)
library(tidyr)
library(igraph)

# One named degree vector per period
lists <- relations %>%
  group_by(period) %>%
  group_map(~ degree(simplify(graph_from_data_frame(.x, directed = FALSE))))

# Vertex order differs between periods, so align each vector by node name before rbind
nodes <- sort(unique(unlist(lapply(lists, names))))
node_df <- data.frame(do.call(rbind, lapply(lists, function(d) d[nodes]))) %>%
  mutate(period = row_number()) %>%
  pivot_longer(cols = !period, names_to = "nodes", values_to = "degree") %>%
  arrange(period, nodes) %>%
  relocate(period, .after = nodes)

  nodes period degree
  <chr>  <int>  <dbl>
1 A          1      2
2 B          1      2
3 C          1      3
4 D          1      1
5 A          2      3
6 B          2      1
7 C          2      1
8 D          2      1

However, this solution is not perfect, since E = 0 is not included. You may have to tweak your first code a bit to include isolated nodes as well. (I know nothing about the igraph library.)
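
One possible way to keep isolated nodes is to pass the full node list of each period through the vertices argument of graph_from_data_frame. This is a sketch, assuming every physician observed in a period should be a vertex of that period's graph. The helper degree_one_period is a hypothetical name, not something from the question or the other answer.

```r
library(dplyr)
library(igraph)

mydf <- data.frame(hospital = c("x","x","x","x","x","y","y","y","w","w","w","w","z"),
                   period = c(1,1,1,2,2,1,2,2,1,1,2,2,2),
                   id = c("A","B","C","A","B","A","A","C","C","D","A","D","E"))

# Same pair construction as in the question
relations <- mydf %>%
  left_join(mydf, by = c("hospital", "period")) %>%
  filter(id.y != id.x)

# Hypothetical helper: degrees for one period, isolated nodes included
degree_one_period <- function(p) {
  # Every physician seen in period p becomes a vertex, connected or not
  nodes <- unique(mydf$id[mydf$period == p])
  edges <- relations[relations$period == p, c("id.x", "id.y")]
  g <- graph_from_data_frame(edges, directed = FALSE,
                             vertices = data.frame(name = nodes))
  # simplify() collapses the duplicate edges created by the self-join
  d <- degree(simplify(g))
  data.frame(node = names(d), period = p, degree = unname(d),
             stringsAsFactors = FALSE)
}

node_df <- bind_rows(lapply(sort(unique(mydf$period)), degree_one_period)) %>%
  arrange(period, node)

print(node_df)
```

Because each period's result is built from the names of its own degree vector, values are matched to nodes by name rather than by position. On the example data this should give the nine rows of the desired output, with E at period 2 having degree 0.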

Upvotes: 0
