Reputation: 2095
I have two data frames
>cat a1.txt
"501" 5.7916 6.9861
"502" 24.9444 18.45
"503" 4 4.7222 5.5
"505" 5 5.2777 5.3
>cat a2.txt
501 "alex"
502 "brian"
503 "romeo"
504 "tango"
505 "zee"
I want to replace the first column of a1.txt with the corresponding value looked up in a2.txt.
I want something like:
alex 5.7916 6.9861
brian 24.9444 18.45
romeo 4 4.7222 5.5
zee 5 5.2777 5.3
I tried:
> a1t <- read.table('a1.txt')
> a2t <- read.table('a2.txt')
> a1t
V1 V2 V3
1 501 5.7916 6.9861
2 502 24.9444 18.4500
3 503 4.0000 4.7222
4 505 5.0000 5.2777
> a2t
V1 V2
1 501 alex
2 502 brian
3 503 romeo
4 504 tango
5 505 zee
> merge(x=a1t, y=a2t,by='V1', all.x=TRUE)
V1 V2.x V3 V2.y
1 501 5.7916 6.9861 alex
2 502 24.9444 18.4500 brian
3 503 4.0000 4.7222 romeo
4 505 5.0000 5.2777 zee
But this does not replace the first column; it adds an extra column instead. How can I get the desired format shown above?
What if my a1.txt is unbalanced, i.e. the number of columns is not the same in every row?
Upvotes: 1
Views: 1663
Reputation: 11617
You can just select what you want:
# keep all rows and select columns 4, 2 and 3, in that order
merge(x=a1t, y=a2t,by='V1', all.x=TRUE)[,c(4,2,3)]
#this will give the data.frame you wanted, that is:
V2.y V2.x V3
1 alex 5.7916 6.9861
2 brian 24.9444 18.4500
3 romeo 4.0000 4.7222
4 zee 5.0000 5.2777
Or, if you swap the order of the data frames in the merge, you can simply drop the first column:
merge(x=a2t, y=a1t,by='V1', all.y=TRUE)[,-c(1)]
##This will give:
V2.x V2.y V3
1 alex 5.7916 6.9861
2 brian 24.9444 18.4500
3 romeo 4.0000 4.7222
4 zee 5.0000 5.2777
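If the goal is to overwrite the key column in place rather than merge and then reorder, a lookup with match() is one alternative. This is only a minimal sketch: the two data frames are rebuilt inline from the values shown in the question instead of being read from the files, which is an assumption made so the example runs on its own.

```r
# Rebuild the two data frames from the question (values copied from the printed output).
a1t <- data.frame(V1 = c(501, 502, 503, 505),
                  V2 = c(5.7916, 24.9444, 4, 5),
                  V3 = c(6.9861, 18.45, 4.7222, 5.2777))
a2t <- data.frame(V1 = c(501, 502, 503, 504, 505),
                  V2 = c("alex", "brian", "romeo", "tango", "zee"),
                  stringsAsFactors = FALSE)

# match() returns, for each key in a1t$V1, the row position of that key in a2t$V1
# (NA when the key has no entry in the lookup table).
idx <- match(a1t$V1, a2t$V1)

# Overwrite the key column with the looked-up names; the other columns are untouched.
a1t$V1 <- a2t$V2[idx]
a1t
```

Any key that is missing from a2t becomes NA, and because only V1 is touched, the number of remaining columns in a1t does not matter.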
You ask:
What if my a1.txt is unbalanced, i.e. the number of columns is not the same in every row?
I am not sure what you mean, but if you mean that some rows are missing values for some variables, fill the missing fields with NA.
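One way to get those NA values without editing the file by hand is the fill argument of read.table. The sketch below is hedged: it writes the four lines of a1.txt from the question to a temporary file, and it assumes that four columns is the widest any row gets (the V1 to V4 names are just the default-style names, chosen here for illustration).

```r
# Write the ragged a1.txt from the question to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c('"501" 5.7916 6.9861',
             '"502" 24.9444 18.45',
             '"503" 4 4.7222 5.5',
             '"505" 5 5.2777 5.3'), tmp)

# fill = TRUE pads short rows with NA; col.names fixes the column count up front
# (4 here, an assumption taken from the longest row in the sample).
a1t <- read.table(tmp, fill = TRUE, col.names = paste0("V", 1:4))
a1t
```

Passing col.names explicitly is a deliberate choice: without it, read.table infers the number of columns from only the first few lines of the file, so a longer row further down could be split incorrectly.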
Upvotes: 2