Reputation: 3555
I want to rename several values in a column all at once, using a vector of replacement names. I know I could replace each value individually, but that would take hours.
I have this dataset:
df=data.frame(species = c("yo.manhereisareallllllylongname",
"heydude.this.is.realllylong",
"sooooooo.long",
"what.whatshouldIdo",
"what.whatshouldIdo",
"shouldIstayorshouldIgo",
"sooooooo.long"),
site = c("site1","site2","site3","site4","site5","site6","site7"))
Looks like this:
species site
1 yo.manhereisareallllllylongname site1
2 heydude.this.is.realllylong site2
3 sooooooo.long site3
4 what.whatshouldIdo site4
5 what.whatshouldIdo site5
6 shouldIstayorshouldIgo site6
7 sooooooo.long site7
I want to create this vector of short names, with one entry per unique species (values that repeat in the dataset appear only once):
short_names=c("ymrln","heydude","slong","wwsid", "sisosig")
Which would correspond to this:
long_names=c("yo.manhereisareallllllylongname","heydude.this.is.realllylong","sooooooo.long","what.whatshouldIdo","shouldIstayorshouldIgo")
The desired result is:
species site
1 ymrln site1
2 heydude site2
3 slong site3
4 wwsid site4
5 wwsid site5
6 sisosig site6
7 slong site7
Is there a quick way to do this? It is essentially a find-and-replace on the data itself, not on the script.
Thanks,
Upvotes: 0
Views: 122
Reputation: 887118
We can also use lookup from library(qdapTools).
library(qdapTools)
df$species <- lookup(df$species, data.frame(long_names, short_names))
df
# species site
#1 ymrln site1
#2 heydude site2
#3 slong site3
#4 wwsid site4
#5 wwsid site5
#6 sisosig site6
#7 slong site7
According to ?lookup:
lookup - data.table based hash table useful for large vector lookups.
Upvotes: 3
Reputation: 4818
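Try this. match() finds the position of each species in long_names, and that position is then used to pick the corresponding short name.
match_df <- data.frame(short_names, long_names)
df$species <- match_df$short_names[match(df$species, match_df$long_names)]
head(df)
# species site
#1 ymrln site1
#2 heydude site2
#3 slong site3
#4 wwsid site4
#5 wwsid site5
#6 sisosig site6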
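If some species have no entry in long_names, a plain positional lookup would turn them into NA. The following is a minimal base R sketch of one way to keep the original value in that case. The variable `idx` and the extra value "not.in.the.table" are illustrative assumptions, not part of the original data.

```r
# Lookup table from the question
long_names  <- c("yo.manhereisareallllllylongname", "heydude.this.is.realllylong",
                 "sooooooo.long", "what.whatshouldIdo", "shouldIstayorshouldIgo")
short_names <- c("ymrln", "heydude", "slong", "wwsid", "sisosig")

# Hypothetical input containing one value that is missing from long_names
species <- c("sooooooo.long", "not.in.the.table", "what.whatshouldIdo")

# match() returns the position of each species in long_names, or NA if absent
idx <- match(species, long_names)

# Use the short name where a match exists, otherwise keep the original value
species <- ifelse(is.na(idx), species, short_names[idx])
species
```

The same pattern should apply to a data frame column by substituting df$species for species, after converting it with as.character() if it is a factor.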
Upvotes: 2
Reputation: 78600
You can do this with the mapvalues function in the plyr package.
library(plyr)
df$species <- mapvalues(df$species, long_names, short_names)
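If installing a package is not an option, a roughly equivalent result can be sketched in base R with a named vector used as a lookup table. The name `lookup_vec` is a hypothetical choice for illustration, and the sketch assumes every species has an entry in long_names.

```r
# Data from the question; stringsAsFactors = FALSE keeps species as character
df <- data.frame(species = c("yo.manhereisareallllllylongname",
                             "heydude.this.is.realllylong",
                             "sooooooo.long",
                             "what.whatshouldIdo",
                             "what.whatshouldIdo",
                             "shouldIstayorshouldIgo",
                             "sooooooo.long"),
                 site = c("site1", "site2", "site3", "site4",
                          "site5", "site6", "site7"),
                 stringsAsFactors = FALSE)

long_names  <- c("yo.manhereisareallllllylongname", "heydude.this.is.realllylong",
                 "sooooooo.long", "what.whatshouldIdo", "shouldIstayorshouldIgo")
short_names <- c("ymrln", "heydude", "slong", "wwsid", "sisosig")

# Named vector: the names are the long names, the values are the short names
lookup_vec <- setNames(short_names, long_names)

# Index by name; as.character() guards against species being a factor,
# and unname() drops the long names from the result
df$species <- unname(lookup_vec[as.character(df$species)])
df
```

Indexing by name returns NA for any species that is missing from the table, so it is worth checking anyNA(df$species) afterwards.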
Upvotes: 3