Reputation: 1
I am new to R and I am struggling with the following issue:
I have a dataset more or less like this:
NAME          Collegue1   Collegue 2
John Smith    Bill Gates  Brad Pitt
Adam Sandler  Bill Gates  John Smith
Bill Gates    Brad Pitt   Adam Sandler
Brad Pitt     John Smith  Bill Gates
I need to create an ID code for each person and replace the names with the corresponding ID in all three columns. How can I do that?
Upvotes: 0
Views: 189
Reputation: 39667
You can convert the names to a factor and use unclass to get the ID codes.
x[-1] <- unclass(factor(unlist(x[-1]), x$NAME))
cbind(x["NAME"], ID=seq_along(x$NAME), x[-1])
#          NAME ID Collegue1 Collegue.2
#1   John Smith  1         3          4
#2 Adam Sandler  2         3          1
#3   Bill Gates  3         4          2
#4    Brad Pitt  4         1          3
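As a rough illustration of why this works, here is a minimal sketch on two small vectors. The names `people` and `colleagues` are made up for this example and are not part of the original data. When the levels are given explicitly, the integer code of each value is simply its position in the levels vector, which is also what match() would return.

```r
# Hypothetical vectors, used only for this illustration
people     <- c("John Smith", "Adam Sandler", "Bill Gates", "Brad Pitt")
colleagues <- c("Bill Gates", "Brad Pitt", "John Smith")

# With explicit levels, each value is coded by its position in `people`
f <- factor(colleagues, levels = people)

codes_unclass <- unclass(f)               # integer codes, still carrying a "levels" attribute
codes_int     <- as.integer(f)            # plain integer codes, no attributes
codes_match   <- match(colleagues, people) # the same positions via a direct lookup

codes_int
#[1] 3 4 1
```

unclass() keeps the levels attached to the result, while as.integer() and match() return bare integers; for filling data frame columns either form should do.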
If you only want the IDs (starting again from the original x given under Data):
levels(factor(unlist(x))) # shows which name each ID code stands for
#[1] "Adam Sandler" "Bill Gates" "Brad Pitt" "John Smith"
x[] <- unclass(factor(unlist(x)))
x
#  NAME Collegue1 Collegue.2
#1    4         2          3
#2    1         2          4
#3    2         3          1
#4    3         4          2
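Once every name has been replaced by a code, it can be handy to keep a lookup table so the codes can be translated back. The following is only a sketch under that assumption; the objects `key`, `ids` and `decoded` are names invented here, and the data frame is rebuilt inline so the example runs on its own.

```r
# Same example data as in this answer, rebuilt so the sketch is self-contained
x <- data.frame(
  NAME       = c("John Smith", "Adam Sandler", "Bill Gates", "Brad Pitt"),
  Collegue1  = c("Bill Gates", "Bill Gates", "Brad Pitt", "John Smith"),
  Collegue.2 = c("Brad Pitt", "John Smith", "Adam Sandler", "Bill Gates"),
  stringsAsFactors = FALSE
)

# One factor over all cells; its levels are the sorted unique names
f <- factor(unlist(x))

# Lookup table: row i holds the name that ID i stands for
key <- data.frame(ID = seq_along(levels(f)), NAME = levels(f),
                  stringsAsFactors = FALSE)

# Replace every cell by its integer code (filled column by column)
ids <- x
ids[] <- as.integer(f)

# Translate the codes back to names using the lookup table
decoded <- ids
decoded[] <- key$NAME[unlist(ids)]
```

Here `key` plays the role of a legend for the ID codes, and `decoded` should contain the same names as the original `x`.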
Data:
x <- structure(list(NAME = c("John Smith", "Adam Sandler", "Bill Gates",
"Brad Pitt"), Collegue1 = c("Bill Gates", "Bill Gates", "Brad Pitt",
"John Smith"), Collegue.2 = c("Brad Pitt", "John Smith", "Adam Sandler",
"Bill Gates")), class = "data.frame", row.names = c(NA, -4L))
Upvotes: 2
Reputation: 101403
You could try the code below
df[] <- as.integer(factor(unlist(df), levels = df$NAME))
which gives
> df
  NAME Collegue1 Collegue2
1    1         3         4
2    2         3         1
3    3         4         2
4    4         1         3
Or, to keep the names in the NAME column (applied to the original df)
df[-1] <- as.integer(factor(unlist(df[-1]), levels = df$NAME))
which gives
> df
          NAME Collegue1 Collegue2
1   John Smith         3         4
2 Adam Sandler         3         1
3   Bill Gates         4         2
4    Brad Pitt         1         3
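One thing to keep in mind with levels = df$NAME: any colleague who does not appear in the NAME column gets NA instead of an ID. The sketch below uses a deliberately reduced, made-up data frame (`df_small`) to show this, together with one possible workaround that builds the levels from all names in the table. The variable names are assumptions for illustration only.

```r
# Hypothetical reduced data: "Bill Gates" is a colleague but has no row of his own
df_small <- data.frame(
  NAME      = c("John Smith", "Adam Sandler"),
  Collegue1 = c("Adam Sandler", "Bill Gates"),
  stringsAsFactors = FALSE
)

# Values that are not among the levels are coded as NA
ids_name_only <- as.integer(factor(unlist(df_small[-1]), levels = df_small$NAME))
ids_name_only
#[1]  2 NA

# Possible workaround: take the levels from every name that occurs anywhere
all_names <- unique(unlist(df_small))
ids_all   <- as.integer(factor(unlist(df_small[-1]), levels = all_names))
ids_all
#[1] 2 3
```

With the example data from the question every colleague also appears in NAME, so this case does not arise there.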
Data
df <- structure(list(NAME = c("John Smith", "Adam Sandler", "Bill Gates",
"Brad Pitt"), Collegue1 = c("Bill Gates", "Bill Gates", "Brad Pitt",
"John Smith"), Collegue2 = c("Brad Pitt", "John Smith", "Adam Sandler",
"Bill Gates")), class = "data.frame", row.names = c(NA, -4L))
Upvotes: 3