Helen Neely
Helen Neely

Reputation: 4740

How do I update data frame fields in R?

I'm looking to update fields in one data table with information from another data table like this:

dt1$name <- dt2$name where dt1$id = dt2$id

In SQL pseudo code : update dt1 set name = dt2.name where dt1.id = dt2.id

As you can see I'm very new to R so every little helps.

Update I think it was my fault - what I really want is to update an telephone number if the usernames from both dataframes match.

So how can I compare names and update a field if both names match? Please help :)

dt1$phone<- dt2$phone where dt1$name = dt2$name

Upvotes: 0

Views: 117

Answers (3)

Aur&#232;le
Aur&#232;le

Reputation: 12819

Joran's answer assumes dt1 and dt2 can be matched by position.

If it's not the case, you may need to merge() first:

dt1 <- data.frame(id = c(1, 2, 3), name = c("a", "b", "c"), stringsAsFactors = FALSE)
dt2 <- data.frame(id = c(7, 3), name = c("f", "g"), stringsAsFactors = FALSE)
dt1 <- merge(dt1, dt2, by = "id", all.x = TRUE)
dt1$name <- ifelse( ! is.na(dt1$name.y), dt1$name.y, dt1$name.x)
dt1

(Edit per your update:

dt1 <- data.frame(id = c(1, 2, 3), name = c("a", "b", "c"), phone = c("123-123", "456-456", NA), stringsAsFactors = FALSE)
dt2 <- data.frame(name = c("f", "g", "a"), phone = c(NA, "000-000", "789-789"), stringsAsFactors = FALSE)
dt1 <- merge(dt1, dt2, by = "name", all.x = TRUE)
dt1$new_phone <- ifelse( ! is.na(dt1$phone.y), dt1$phone.y, dt1$phone.x)

Upvotes: 2

Benjamin
Benjamin

Reputation: 17369

If you're more comfortable working in SQL, you can use the sqldf package:

dt1 <- data.frame(id = c(1, 2, 3),
                  name = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

dt2 <- data.frame(id = c(2, 3, 4),
                  name = c("X", "Y", "Z"),
                  stringsAsFactors = FALSE)

library(sqldf)

sqldf("SELECT dt1.id, 
        CASE WHEN dt2.name IS NULL THEN dt1.name ELSE dt2.name END name
      FROM dt1
        LEFT JOIN dt2
          ON dt1.id = dt2.id")

But, computationally, it's about 150 times slower than joran's solution, and quite a bit slower in the human time as well. However, if you are ever in a bind and just need to do something that you can do easily in SQL, it's an option.

Upvotes: 1

joran
joran

Reputation: 173577

Try:

dt1$name <- ifelse(dt1$id == dt2$id, dt2$name, dt1$name)

Alternatively, maybe:

dt1$name[dt1$id == dt2$id] <- dt2$name[dt1$id == dt2$id]

Upvotes: 2

Related Questions