MarcoD
MarcoD

Reputation: 137

match info between columns in R

I have two dataframes structured as follows Database called "notes"

id   breed   gender   date        text
1    NA      NA       01/01/2000  fdsgkefge
1    NA      NA       01/02/2001  glerlefde
2    NA      NA       02/01/2000  ddubgui
2    NA      NA       02/01/2000  sdfsfbfb
1    NA      NA       02/03/2002  geherhwt
3    NA      NA       10/10/2003  dggqgeqgr
4    NA      NA       21/07/2003  ssihiqgho

Database with demographic info "demo"

id   breed  gender
1    A      M
2    B      F
3    B      M
4    C      M

I want the final datafrane as follows:

id  breed   gender  date        text
1   A       M      01/01/2000   fdsgkefge
1   A       M      01/02/2001   glerlefde
2   B       F      02/01/2000   ddubgui
2   B       F      02/01/2000   sdfsfbfb
1   A       M      02/03/2002   geherhwt
3   B       M      10/10/2003   dggqgeqgr
4   C       M      21/07/2003   ssihiqgho

I wrote this code which is not working:

for(i in 1:length(practice)){
  notes$breed[i]<-demo$breed[demo$id==notes$id[i]],
  notes$gender[i]<-demo$gender[demo$id==notes$id[i]]
}

Maybe a loop is not a good idea here, I am sure there must be a function that sorts this out (tried match() with no success), just not sure what function to look for. Any help? Cheers Marco

Upvotes: 0

Views: 51

Answers (2)

MarcoD
MarcoD

Reputation: 137

Many thanks, I also got some help here at work (although the merge option seems the easiest) so I thought I add this one too for completeness

notes[,"breed"] <- demo[notes[,"id"],"breed"]
notes[,"gender"] <- demo[notes[,"id"],"gender"]

Many thanks! Marco

Upvotes: 1

lukeA
lukeA

Reputation: 54257

If the row and column order doesn't matter, here are some options, which differ in performance:

merge(x=notes[, c("id", "date", "text")], y=demo, by="id")
#   id       date      text breed gender
# 1  1 01/01/2000 fdsgkefge     A      M
# 2  1 01/02/2001 glerlefde     A      M
# 3  1 02/03/2002  geherhwt     A      M
# 4  2 02/01/2000   ddubgui     B      F
# 5  2 02/01/2000  sdfsfbfb     B      F
# 6  3 10/10/2003 dggqgeqgr     B      M
# 7  4 21/07/2003 ssihiqgho     C      M

library(dplyr)
left_join(notes[, c("id", "date", "text")], demo)
#   id       date      text gender breed
# 1  1 01/01/2000 fdsgkefge      M     A
# 2  1 01/02/2001 glerlefde      M     A
# 3  2 02/01/2000   ddubgui      F     B
# 4  2 02/01/2000  sdfsfbfb      F     B
# 5  1 02/03/2002  geherhwt      M     A
# 6  3 10/10/2003 dggqgeqgr      M     B
# 7  4 21/07/2003 ssihiqgho      M     C

library(data.table)
demo.dt <- data.table(demo, key="id")
notes.dt <- data.table(notes[, c("id", "date", "text")])
demo.dt[notes.dt]
#    id breed gender       date      text
# 1:  1     A      M 01/01/2000 fdsgkefge
# 2:  1     A      M 01/02/2001 glerlefde
# 3:  2     B      F 02/01/2000   ddubgui
# 4:  2     B      F 02/01/2000  sdfsfbfb
# 5:  1     A      M 02/03/2002  geherhwt
# 6:  3     B      M 10/10/2003 dggqgeqgr
# 7:  4     C      M 21/07/2003 ssihiqgho 

Upvotes: 2

Related Questions