AldoP

Reputation: 1

Converting Names into Identification Codes in different columns in R

I am new to R and I am struggling with the following issue:

I have a dataset more or less like this:

NAME                     Collegue1                  Collegue 2
John Smith               Bill Gates                 Brad Pitt
Adam Sandler             Bill Gates                 John Smith
Bill Gates               Brad Pitt                  Adam Sandler
Brad Pitt                John Smith                 Bill Gates

I need to create an ID code for each name and replace the names with the corresponding ID in all three columns. How can I do that?

Upvotes: 0

Views: 189

Answers (2)

GKi

Reputation: 39667

You can convert the names to a factor and use unclass to get the ID codes.

x[-1] <- unclass(factor(unlist(x[-1]), x$NAME))
cbind(x["NAME"], ID=seq_along(x$NAME), x[-1])
#          NAME ID Collegue1 Collegue.2
#1   John Smith  1         3          4
#2 Adam Sandler  2         3          1
#3   Bill Gates  3         4          2
#4    Brad Pitt  4         1          3
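
As an alternative sketch (not part of the approach above), the same row-based IDs can be obtained with match(), which returns the position of each name within x$NAME. The object name ids is only illustrative, and the data is repeated so the block runs on its own.

```r
# Example data, same as in the Data section of this answer
x <- data.frame(
  NAME = c("John Smith", "Adam Sandler", "Bill Gates", "Brad Pitt"),
  Collegue1 = c("Bill Gates", "Bill Gates", "Brad Pitt", "John Smith"),
  Collegue.2 = c("Brad Pitt", "John Smith", "Adam Sandler", "Bill Gates"),
  stringsAsFactors = FALSE
)

ids <- x
# For every colleague column, look up the row position of each name in x$NAME
ids[-1] <- lapply(x[-1], match, table = x$NAME)
# Add an explicit ID column next to NAME
ids <- cbind(ids["NAME"], ID = seq_along(ids$NAME), ids[-1])
ids
```

A name that does not occur in x$NAME would come out as NA with either approach.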

If you are only interested in the IDs (starting again from the original data):

levels(factor(unlist(x))) # Optional: shows which name each code stands for
#[1] "Adam Sandler" "Bill Gates"   "Brad Pitt"    "John Smith"
x[] <- unclass(factor(unlist(x)))
x
#  NAME Collegue1 Collegue.2
#1    4         2          3
#2    1         2          4
#3    2         3          1
#4    3         4          2
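
Since this variant drops the names, it may help to keep a lookup table so the codes can be translated back. The following is a minimal sketch under that assumption. The names key and codes are hypothetical, and the data is repeated so the block runs on its own.

```r
# Example data, same as in the Data section of this answer
x <- data.frame(
  NAME = c("John Smith", "Adam Sandler", "Bill Gates", "Brad Pitt"),
  Collegue1 = c("Bill Gates", "Bill Gates", "Brad Pitt", "John Smith"),
  Collegue.2 = c("Brad Pitt", "John Smith", "Adam Sandler", "Bill Gates"),
  stringsAsFactors = FALSE
)

# One factor over all cells: its levels are the sorted unique names
f <- factor(unlist(x))

# Lookup table: code i stands for the i-th level
key <- data.frame(ID = seq_along(levels(f)), NAME = levels(f),
                  stringsAsFactors = FALSE)

# Replace every cell by its code
codes <- x
codes[] <- lapply(x, function(col) match(col, key$NAME))

# Translate a column of codes back into names
key$NAME[codes$Collegue1]
```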

Data:

x <- structure(list(NAME = c("John Smith", "Adam Sandler", "Bill Gates", 
"Brad Pitt"), Collegue1 = c("Bill Gates", "Bill Gates", "Brad Pitt", 
"John Smith"), Collegue.2 = c("Brad Pitt", "John Smith", "Adam Sandler", 
"Bill Gates")), class = "data.frame", row.names = c(NA, -4L))

Upvotes: 2

ThomasIsCoding

Reputation: 101403

You can try the following code, which replaces every column, including NAME, with the IDs:

df[] <- as.integer(factor(unlist(df), levels = df$NAME))

such that

> df
  NAME Collegue1 Collegue2
1    1         3         4
2    2         3         1
3    3         4         2
4    4         1         3

Or, starting again from the original data, to keep the names in the NAME column:

df[-1] <- as.integer(factor(unlist(df[-1]), levels = df$NAME))

such that

> df
          NAME Collegue1 Collegue2
1   John Smith         3         4
2 Adam Sandler         3         1
3   Bill Gates         4         2
4    Brad Pitt         1         3
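
One caveat: with levels = df$NAME, any colleague who does not appear in the NAME column becomes NA. The sketch below uses a small made-up data set (not the one from the question) and builds the levels from all columns instead. The names narrow, all_names and ids are hypothetical.

```r
# Made-up data where "Bill Gates" appears only as a colleague
df <- data.frame(
  NAME = c("John Smith", "Adam Sandler"),
  Collegue1 = c("Bill Gates", "John Smith"),
  stringsAsFactors = FALSE
)

# Levels taken from NAME only: "Bill Gates" has no level and becomes NA
narrow <- as.integer(factor(df$Collegue1, levels = df$NAME))

# Levels taken from every column, in order of first appearance
all_names <- unique(unlist(df))
ids <- df
ids[] <- lapply(df, function(col) as.integer(factor(col, levels = all_names)))
ids
```

People listed in NAME still get their row number as ID, because the NAME column is unlisted first (assuming the names in it are unique).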

Data

df <- structure(list(NAME = c("John Smith", "Adam Sandler", "Bill Gates", 
"Brad Pitt"), Collegue1 = c("Bill Gates", "Bill Gates", "Brad Pitt", 
"John Smith"), Collegue2 = c("Brad Pitt", "John Smith", "Adam Sandler", 
"Bill Gates")), class = "data.frame", row.names = c(NA, -4L))

Upvotes: 3
