Timm S.

Reputation: 5425

R: Structure tag data for use in Gephi

I have prepared a dataset with about 20k rows of unique identifiers and ~60 columns, each holding a boolean flag that indicates whether a tag is connected to that identifier:

ID   Gender   Tag1   Tag2   Tag3   Tag4   Tag5   Tag6   Tag7    ...
A    m        0      1      1      0      0      0      0       ...
B    m        1      0      1      0      0      1      0       ...
C    f        1      1      0      0      0      1      1       ...

I would like to explore the data in Gephi, but I don't know what export structure or format to use. What should the data look like so that I can explore the relations between tags? What do I need to do to get it into that structure? Do I need to summarize it further, and can I keep the gender data as an attribute for analysis?

Upvotes: 0

Views: 153

Answers (1)

lukeA

Reputation: 54237

Assuming your data represents a network, here is one way to export it as an edge list CSV:

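# Example data in the same shape as the question (semicolon-separated)
df <- read.table(header = TRUE, sep = ";", text = "ID;Gender;Tag1;Tag2;Tag3;Tag4;Tag5;Tag6;Tag7
A;m;0;1;1;0;0;0;0
B;m;1;0;1;0;0;1;0
C;f;1;1;0;0;0;1;1")

library(dplyr)
library(tidyr)
library(magrittr)  # provides set_names()
df %>%
  # wide to long: one row per ID/tag pair (pivot_longer() is the newer equivalent)
  gather(Target, isTrue, -ID, -Gender) %>%
  # keep only the pairs where the tag is actually set
  filter(isTrue == 1) %>%
  select(-isTrue) %>%
  # name the edge columns Source and Target for the Gephi import
  set_names(c("Source", "Gender", "Target")) %>%
  write.csv(file = file.path(tempdir(), "my.csv"), row.names = FALSE)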
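
The pipeline above links each ID to its tags. If what you are after is the relation between the tags themselves, one possible variation is a tag co-occurrence edge list, where the weight counts how many IDs share both tags. This is only a sketch in base R under my own assumptions: the file name tag_edges.csv and the object names are made up for the example, and the Weight and Type column names follow what I understand the Gephi spreadsheet importer to recognise.

```r
# Same three example rows as in the question
df <- read.table(header = TRUE, sep = ";", text = "ID;Gender;Tag1;Tag2;Tag3;Tag4;Tag5;Tag6;Tag7
A;m;0;1;1;0;0;0;0
B;m;1;0;1;0;0;1;0
C;f;1;1;0;0;0;1;1")

# 0/1 matrix with one row per ID and one column per tag
tag_cols <- setdiff(names(df), c("ID", "Gender"))
m <- as.matrix(df[, tag_cols])

# t(m) %*% m: cell [i, j] counts the IDs that carry both tag i and tag j
co <- crossprod(m)

# Keep each unordered pair once (upper triangle) and drop pairs that never co-occur
idx <- which(upper.tri(co) & co > 0, arr.ind = TRUE)
tag_edges <- data.frame(
  Source = rownames(co)[idx[, "row"]],
  Target = colnames(co)[idx[, "col"]],
  Weight = co[idx],
  Type   = "Undirected",
  stringsAsFactors = FALSE
)

# Hypothetical file name, chosen for this example
write.csv(tag_edges, file.path(tempdir(), "tag_edges.csv"), row.names = FALSE)
```

Gender is a property of the IDs, not of the tags, so it drops out of this summary. If you need it, you could build one such table per gender by subsetting df first.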

You can import my.csv in Gephi as an edge list and let Gephi create the node list automatically.
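
In the edge list, Gender ends up as a column on the edges. If you would rather have it as a node attribute, a separate node table might work better than the automatically created one. The following is a rough sketch, assuming the node importer keys on a column named Id; the Kind column and the file name nodes.csv are my own additions for illustration.

```r
# Same three example rows as in the question
df <- read.table(header = TRUE, sep = ";", text = "ID;Gender;Tag1;Tag2;Tag3;Tag4;Tag5;Tag6;Tag7
A;m;0;1;1;0;0;0;0
B;m;1;0;1;0;0;1;0
C;f;1;1;0;0;0;1;1")

tag_cols <- setdiff(names(df), c("ID", "Gender"))

# One row per node: IDs carry their gender, tags have none (NA)
nodes <- rbind(
  data.frame(Id = as.character(df$ID), Kind = "id",
             Gender = as.character(df$Gender), stringsAsFactors = FALSE),
  data.frame(Id = tag_cols, Kind = "tag",
             Gender = NA_character_, stringsAsFactors = FALSE)
)

# Hypothetical file name; NA is written as an empty cell
write.csv(nodes, file.path(tempdir(), "nodes.csv"), row.names = FALSE, na = "")
```

The Id values match the Source and Target values of the edge list, so the two tables should line up when both are imported into the same workspace.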

Upvotes: 1
