kkjoe

Reputation: 795

R: loop through one column of data frame and output

I have a data frame df in R like the one below. I want to loop over df and process the rows for each distinct value of hee_provn1 in turn.

npi_one npi_two hee_provn1
 1        2       175221
 3        4       175221
 5        6       175221
 7        8       175221
 9        10      576546
 11       12      576546
 13       14      576546
 15       16      789535
 17       18      789535
 19       20      789535

My current R code, which handles a single value of hee_provn1, is:

library(dplyr)
library(igraph)
df2 <- filter(df, hee_provn1 == '175221')
df3 <- df2 [,c("npi_one","npi_two")]
l = c(apply(df3,1,c))
G <- graph(l,directed = FALSE )

degree(G) -> d
closeness(G) -> c
betweenness(G) -> b
eigen_centrality(G)$vector -> e

cent_df = data.frame(d,c,b,e)
colnames(cent_df) <- c('degree', 'closeness','betweenness','eigen')
cbind(hee_provn1 = 175221,cent_df)

The result table cent_df for the first iteration (hee_provn1 = 175221) is:

  hee_provn1 degree  closeness betweenness     eigen
1     175221      1 0.02040816           0 0.3227867
2     175221      1 0.02040816           0 0.3227867
3     175221      1 0.02040816           0 0.0000000
4     175221      1 0.02040816           0 0.0000000
5     175221      1 0.02040816           0 1.0000000
6     175221      1 0.02040816           0 1.0000000
7     175221      1 0.02040816           0 0.0000000
8     175221      1 0.02040816           0 0.0000000

The result table cent_df for the second iteration (hee_provn1 = 576546) is:

   hee_provn1 degree   closeness betweenness eigen
1      576546      0 0.005494505           0     0
2      576546      0 0.005494505           0     0
3      576546      0 0.005494505           0     0
4      576546      0 0.005494505           0     0
5      576546      0 0.005494505           0     0
6      576546      0 0.005494505           0     0
7      576546      0 0.005494505           0     0
8      576546      0 0.005494505           0     0
9      576546      1 0.005917160           0     1
10     576546      1 0.005917160           0     1
11     576546      1 0.005917160           0     0
12     576546      1 0.005917160           0     0
13     576546      1 0.005917160           0     0
14     576546      1 0.005917160           0     0

My ideal result is that, by going through a loop, I can combine all the result tables into one big table like this:

  hee_provn1 degree  closeness betweenness     eigen
1     175221      1 0.02040816           0 0.3227867
2     175221      1 0.02040816           0 0.3227867
3     175221      1 0.02040816           0 0.0000000
4     175221      1 0.02040816           0 0.0000000
5     175221      1 0.02040816           0 1.0000000
6     175221      1 0.02040816           0 1.0000000
7     175221      1 0.02040816           0 0.0000000
8     175221      1 0.02040816           0 0.0000000
9     576546      0 0.005494505           0     0
10    576546      0 0.005494505           0     0
11    576546      0 0.005494505           0     0
12    576546      0 0.005494505           0     0
13    576546      0 0.005494505           0     0
14    576546      0 0.005494505           0     0
15    576546      0 0.005494505           0     0
16    576546      0 0.005494505           0     0
17    576546      1 0.005917160           0     1
18    576546      1 0.005917160           0     1
19    576546      1 0.005917160           0     0
20    576546      1 0.005917160           0     0
21    576546      1 0.005917160           0     0
22    576546      1 0.005917160           0     0

And I really hope it can be as efficient as possible.

Upvotes: 0

Views: 468

Answers (1)

CPak

Reputation: 13581

Your example data

df <- data.frame(npi_one=seq(1,19,2),
                 npi_two=seq(2,20,2),
                 hee_provn1=c(rep(175221,4),rep(576546,3),rep(789535,3)))

In addition to igraph, you will need the tidyverse:

library(tidyverse)
library(igraph)

I have annotated the following code to show how each step corresponds to your original code:

final <- df %>%
  group_by(hee_provn1) %>%   # one group per value, like filter(df, hee_provn1 == '175221')
  nest() %>%                 # remaining columns go into `data`, like df2[, c("npi_one", "npi_two")]
  mutate(data = map(data, ~ c(apply(.x, 1, c)))) %>%          # like c(apply(df3, 1, c))
  mutate(data = map(data, ~ graph(.x, directed = FALSE))) %>% # like graph(l, directed = FALSE)
  mutate(data = map(data, ~ data.frame(degree = degree(.x),   # like computing d, c, b, e individually
                                       closeness = closeness(.x),
                                       betweenness = betweenness(.x),
                                       eigen_centrality = eigen_centrality(.x)$vector))) %>%
  unnest(data)               # back to an ordinary data frame

Output of head(final):

   hee_provn1 degree   closeness betweenness eigen_centrality
 1     175221      1 0.020408163           0     1.000000e+00
 2     175221      1 0.020408163           0     1.000000e+00
 3     175221      1 0.020408163           0     0.000000e+00
 4     175221      1 0.020408163           0     0.000000e+00
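For comparison, the same split-apply-combine idea can be sketched in base R with split(), lapply() and rbind(). This is only a sketch, not a tuned implementation: the helper name centrality_for_group and the result name final_base are made up for illustration, and make_graph() is used here in place of graph() on the assumption that a reasonably recent igraph is installed.

```r
library(igraph)

df <- data.frame(npi_one = seq(1, 19, 2),
                 npi_two = seq(2, 20, 2),
                 hee_provn1 = c(rep(175221, 4), rep(576546, 3), rep(789535, 3)))

# Hypothetical helper: builds the centrality table for the rows of one group
centrality_for_group <- function(g) {
  # Interleave the two id columns row by row, same as c(apply(df3, 1, c))
  edges <- as.vector(t(as.matrix(g[, c("npi_one", "npi_two")])))
  G <- make_graph(edges, directed = FALSE)
  data.frame(hee_provn1 = g$hee_provn1[1],
             degree = degree(G),
             closeness = closeness(G),
             betweenness = betweenness(G),
             eigen = eigen_centrality(G)$vector)
}

# One data frame per hee_provn1 value, then one table per group, stacked
groups <- split(df, df$hee_provn1)
final_base <- do.call(rbind, lapply(groups, centrality_for_group))
rownames(final_base) <- NULL
```

Like the pipeline above, this keeps the numeric ids as vertex indices, so each group's table has one row per vertex from 1 up to the largest id in that group.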

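NOTE: Each time I run eigen_centrality I get different values, so make sure that it returns the values you expect. A possible cause is that each group's graph is disconnected, in which case eigenvector centrality is not uniquely defined.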
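A quick way to inspect the graphs behind those numbers is sketched below, using the second group as an example. It illustrates two things that may matter when reading the output: passing numeric ids to the graph constructor creates every vertex from 1 up to the largest id (which would explain the degree-0 rows in the question's second table), and each edge forms its own component. The variable names second, G_ids and G_named are made up for this sketch, and graph_from_data_frame() is shown only as one possible alternative that treats the ids as names.

```r
library(igraph)

df <- data.frame(npi_one = seq(1, 19, 2),
                 npi_two = seq(2, 20, 2),
                 hee_provn1 = c(rep(175221, 4), rep(576546, 3), rep(789535, 3)))

second <- df[df$hee_provn1 == 576546, c("npi_one", "npi_two")]

# Numeric ids are treated as vertex indices: vertices 1..14 are created,
# and 1..8 have no edges in this group
G_ids <- make_graph(as.vector(t(as.matrix(second))), directed = FALSE)
vcount(G_ids)            # 14

# Ids treated as vertex names: only the endpoints present in this group
G_named <- graph_from_data_frame(second, directed = FALSE)
vcount(G_named)          # 6
V(G_named)$name

# Every edge is a separate component, so the graph is disconnected
is_connected(G_named)    # FALSE
components(G_named)$no   # 3
```

If the isolated padding vertices are not wanted in the final table, building each group's graph from names rather than indices is one option to consider.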

Upvotes: 1
