JdeMello
JdeMello

Reputation: 1718

Matching and replacing values in different columns given an ID number

Suppose I have a data frame in R df and I am trying to replace the Client1, Client2 and Client3 numbers(IDs) by their respective name. I know that each number in the Client* vector represents a name given its id #.

Client1 Client2 Client3 X Y Z id Name A B C
1       2       3       y n y 1  Jay  n y n
4       1       2       y n n 4  Jack n y n
2       10      5       y n n 2  Jill y y y
8       4       2       n n y 8  Ana  y y y

I 'd greatly appreciate if someone knows a series of commands that would perform the following change to the frame:

Client1 Client2 Client3 X Y Z id Name A B C
Jay     Jill    Tom     y n y 1  Jay  n y n
Jack    Jay     Jill    y n n 4  Jack n y n
Jill    Dan     Wayne   y n n 2  Jill y y y
Ana     Jack    Jill    n n y 8  Ana  y y y

Hence the Client*column matches the id and name.

Many thanks!

Upvotes: 1

Views: 248

Answers (1)

akrun
akrun

Reputation: 887981

Assuming that the OP provided only a small part of the example data and in the subsequent rows, matching 'ids' for 'Dan', 'Wayne' will be found.

One option would be to match the 'Client' columns in the original dataset after converting to matrix with the 'id' column of 'df1'. This will provide the numeric index to be used for getting the corresponding 'Name' element from the dataset. The output can be assigned back to the 'Client' columns to replace the numbers with the 'Name'

  nm1 <- grep('^Client', colnames(df1))
  df1[nm1] <- df1$Name[match(as.matrix(df1[nm1]), df1$id)]
  df1
  #  Client1 Client2 Client3 X Y Z id  Name A B C
  #1     Jay    Jill     Tom y n y  1   Jay n y n
  #2    Jack     Jay    Jill y n n  4  Jack n y n
  #3    Jill     Dan   Wayne y n n  2  Jill y y y
  #4     Ana    Jack    Jill n n y  8   Ana y y y
  #5   Wayne     Tom     Jay n n y  5 Wayne n y n
  #6     Dan    Jack     Tom y n n 10   Dan y n y
  #7     Tom    Jack    Jill n y n  3   Tom n y y

data

  df1 <- structure(list(Client1 = c(1L, 4L, 2L, 8L, 5L, 10L, 3L), 
  Client2 = c(2L, 
  1L, 10L, 4L, 3L, 4L, 4L), Client3 = c(3L, 2L, 5L, 2L, 1L, 3L, 
  2L), X = c("y", "y", "y", "n", "n", "y", "n"), Y = c("n", "n", 
  "n", "n", "n", "n", "y"), Z = c("y", "n", "n", "y", "y", "n", 
  "n"), id = c(1L, 4L, 2L, 8L, 5L, 10L, 3L), Name = c("Jay", "Jack", 
  "Jill", "Ana", "Wayne", "Dan", "Tom"), A = c("n", "n", "y", "y", 
  "n", "y", "n"), B = c("y", "y", "y", "y", "y", "n", "y"), C = c("n", 
  "n", "y", "y", "n", "y", "y")), .Names = c("Client1", "Client2", 
  "Client3", "X", "Y", "Z", "id", "Name", "A", "B", "C"),
   class = "data.frame", row.names = c(NA, -7L))

NOTE: Added some more rows to have all the 'Name' corresponding to the 'number'

Upvotes: 3

Related Questions