Reputation: 195
Hi, I am new to R. I have a problem: I need to build a network of users (uID) and a network of articles (faID) from a data frame called w2, which looks like this:
faID uID
1 1256
1 54789
1 547821
2 3258
2 4521
2 4528
3 98745
3 1256
3 3258
3 2145
This is just an example; I have over 20,000 articles. What I want is to build relationships between users based on the articles they share, in data frame format, e.g.:
**##for article one##**
1256 54789
1256 547821
54789 547821
**##similarly for article 2##**
3258 4521
3258 4528
4528 4521
I was using a sparse matrix format, but R runs out of memory before I can build the network and compute the centrality scores of users and articles. Any help would be highly appreciated. Some additional information:
dput(head(w2))
structure(list(faID = c(1L, 1L, 1L, 1L, 1L, 1L), uID = c(20909L, 6661L, 1591L, 28065L, 42783L, 3113L)), .Names = c("faID", "uID"), row.names = c(7L, 9L, 10L, 12L, 14L, 16L), class = "data.frame")
dim(w2)
[1] 364323 2
Upvotes: 0
Views: 1389
Reputation: 7113
Here is one answer (among many possible solutions) to the question of how to construct a data.frame for the adjacencies user -- (article) -- user using dplyr. Here tab is a data frame with the columns article and user (faID and uID in the question):
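library( dplyr )
# tab: one row per (article, user) pair
edges <- as_tibble( tab ) %>%
  group_by( article ) %>%
  # articles with a single user yield no pairs; drop them before calling combn()
  filter( n() >= 2 ) %>%
  do( {
    # all unordered user pairs within this article, in sorted order
    tmp <- combn( sort(.$user), m = 2 )
    data.frame( a = tmp[1,], b = tmp[2,], stringsAsFactors = FALSE )
  } ) %>%
  ungroup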
which gives
Source: local data frame [12 x 3]
article a b
1 1 u1 u2
2 1 u1 u3
3 1 u2 u3
4 2 u2 u4
...
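For the data in the question, the same edge list can also be sketched in base R with a self-join instead of combn. This is only a minimal sketch, assuming the columns are named faID and uID as in w2; the names pairs and user_edges are made up for illustration. Note that the join creates n^2 rows for an article with n users before filtering, so a few very large articles can still be expensive.

```r
# Sample data copied from the question (columns faID and uID).
w2 <- data.frame(
  faID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
  uID  = c(1256L, 54789L, 547821L, 3258L, 4521L, 4528L,
           98745L, 1256L, 3258L, 2145L)
)

# Drop duplicated (article, user) rows so that no pair is counted twice.
w2 <- unique(w2)

# Self-join on the article ID: every user of an article is paired with
# every user of the same article (including itself, in both orders).
pairs <- merge(w2, w2, by = "faID", suffixes = c("_a", "_b"))
names(pairs) <- c("faID", "a", "b")

# Keep each unordered pair once, smaller ID first; this also drops self-pairs.
user_edges <- pairs[pairs$a < pairs$b, ]
user_edges <- user_edges[order(user_edges$faID, user_edges$a, user_edges$b), ]
rownames(user_edges) <- NULL

print(user_edges)
```

Swapping the roles of the two columns (joining on uID and keeping rows where the first faID is smaller than the second) should give the corresponding article -- (user) -- article pairs.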
If you want to summarise how many articles two users have in common you can do this by:
edges <- edges %>%
group_by( a, b ) %>%
summarise( article_in_common = length(article) ) %>%
ungroup
Source: local data frame [6 x 3]
a b article_in_common
1 u1 u2 1
2 u1 u3 1
3 u1 u4 1
4 u1 u6 1
...
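Note that this is possible because you sorted the users prior to the call of combn: each pair then always appears in the same order (a, b), so group_by( a, b ) puts identical pairs into the same group.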
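To see why the ordering matters, here is a small base R sketch with made-up data (the data frame raw and its columns x and y are hypothetical). One pair is deliberately stored in reverse order; pmin and pmax put every pair into the same order before counting, which is what sorting before combn achieves above.

```r
# Hypothetical pairs, one row per shared article; row 4 is stored as (20, 10).
raw <- data.frame(
  faID = c(1L, 1L, 1L, 2L, 2L),
  x    = c(10L, 10L, 20L, 20L, 20L),
  y    = c(20L, 30L, 30L, 10L, 40L)
)

# Canonical order: smaller ID in a, larger ID in b.
raw$a <- pmin(raw$x, raw$y)
raw$b <- pmax(raw$x, raw$y)

# Count the number of articles per (a, b) pair.
counts <- aggregate(faID ~ a + b, data = raw, FUN = length)
names(counts)[3] <- "article_in_common"
counts <- counts[order(counts$a, counts$b), ]

print(counts)
```

Without the pmin/pmax step, (10, 20) and (20, 10) would end up in two separate groups with a count of 1 each.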
From this data you can easily construct an igraph object:
library(igraph)
g <- graph.data.frame( select(edges, a, b, weight = article_in_common), directed = FALSE )
plot(g)
On this graph you can call any of the available centrality or community measures. See for instance ?centralize.scores.
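As a rough sketch of what such calls could look like, the following computes a few per-user scores on a tiny graph. The edge list and its weights are made up for illustration, and graph_from_data_frame is the current name of graph.data.frame in recent igraph versions. Betweenness is computed with weights = NA here on the assumption that the number of shared articles should not be read as a distance.

```r
library(igraph)

# Hypothetical weighted edge list: weight = number of articles in common.
edges <- data.frame(
  a = c("u1", "u1", "u2", "u2"),
  b = c("u2", "u3", "u3", "u4"),
  weight = c(2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# The first two columns define the edges; further columns become edge attributes.
g <- graph_from_data_frame(edges, directed = FALSE)

deg   <- degree(g)                     # number of distinct neighbours
stren <- strength(g)                   # sum of edge weights per user
btw   <- betweenness(g, weights = NA)  # unweighted shortest paths

scores <- data.frame(
  user = V(g)$name,
  degree = deg,
  strength = stren,
  betweenness = btw
)

# Users ordered by total number of shared articles.
print(scores[order(-scores$strength), ])
```

Which measure is appropriate depends on what "central" should mean for your users, so treat the selection above as a starting point only.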
Upvotes: 1