M. Beausoleil

Reputation: 3555

Changing the name of the values in a column in R

I want to rename, all at once, the values in one column of a data frame. I know I could replace each value individually, but that would take hours.

I have this dataset:

df=data.frame(species = c("yo.manhereisareallllllylongname",
                       "heydude.this.is.realllylong",
                       "sooooooo.long",
                       "what.whatshouldIdo",
                       "what.whatshouldIdo",
                       "shouldIstayorshouldIgo",
                       "sooooooo.long"), 
           site = c("site1","site2","site3","site4","site5","site6","site7"))

Looks like this:

                          species  site
1 yo.manhereisareallllllylongname site1
2     heydude.this.is.realllylong site2
3                   sooooooo.long site3
4              what.whatshouldIdo site4
5              what.whatshouldIdo site5
6          shouldIstayorshouldIgo site6
7                   sooooooo.long site7

I want to use this vector of short names (note that each name appears only once here, unlike in the original dataset, where some are repeated):

short_names=c("ymrln","heydude","slong","wwsid", "sisosig")

Each short name corresponds, in the same order, to one of these long names:

long_names=c("yo.manhereisareallllllylongname","heydude.this.is.realllylong","sooooooo.long","what.whatshouldIdo","shouldIstayorshouldIgo")

The final result should be:

  species  site
1   ymrln site1
2 heydude site2
3   slong site3
4   wwsid site4
5   wwsid site5
6 sisosig site6
7   slong site7

Is there a quick way to do this? It is essentially a find-and-replace on the values in the dataset, not in the script.

Thanks,

Upvotes: 0

Views: 122

Answers (3)

akrun

Reputation: 887118

We can also use lookup from library(qdapTools).

library(qdapTools)
df$species <- lookup(df$species, data.frame(long_names, short_names))

df
#  species  site
#1   ymrln site1
#2 heydude site2
#3   slong site3
#4   wwsid site4
#5   wwsid site5
#6 sisosig site6
#7   slong site7

According to ?lookup:

lookup - data.table based hash table useful for large vector lookups.
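
For comparison, the same lookup idea can be sketched in base R with a named character vector, without any extra package. This is only a minimal sketch: the name `name_map` is made up for illustration, and the data is copied from the question.

```r
# Data from the question
df <- data.frame(species = c("yo.manhereisareallllllylongname",
                             "heydude.this.is.realllylong",
                             "sooooooo.long",
                             "what.whatshouldIdo",
                             "what.whatshouldIdo",
                             "shouldIstayorshouldIgo",
                             "sooooooo.long"),
                 site = c("site1","site2","site3","site4","site5","site6","site7"))
long_names <- c("yo.manhereisareallllllylongname", "heydude.this.is.realllylong",
                "sooooooo.long", "what.whatshouldIdo", "shouldIstayorshouldIgo")
short_names <- c("ymrln", "heydude", "slong", "wwsid", "sisosig")

# name_map (hypothetical name): long names are the keys, short names the values
name_map <- setNames(short_names, long_names)

# as.character() guards against species being a factor, which would
# otherwise be used as integer codes when indexing
df$species <- unname(name_map[as.character(df$species)])
df
```

Any species without an entry in `name_map` would come back as NA with this approach, so it assumes every long name is covered.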

Upvotes: 3

narendra-choudhary

Reputation: 4818

Try this.

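match_df <- data.frame(short_names, long_names)
# match() finds the row of each species in the lookup table. Indexing directly
# with df$species would use factor level codes (alphabetical order) and
# assign the wrong short names.
df$species <- match_df$short_names[match(df$species, match_df$long_names)]

head(df)
#  species  site
#1   ymrln site1
#2 heydude site2
#3   slong site3
#4   wwsid site4
#5   wwsid site5
#6 sisosig site6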

Upvotes: 2

David Robinson

Reputation: 78600

You can do this with the mapvalues function in the plyr package.

library(plyr)
df$species <- mapvalues(df$species, long_names, short_names)
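
As far as I know, mapvalues leaves any value that is not listed in its `from` argument unchanged. A rough base R sketch of that behaviour using match(), assuming a hypothetical extra species "unmapped.name" that has no short name (the names `species`, `idx` and `renamed` are made up for illustration):

```r
long_names <- c("yo.manhereisareallllllylongname", "heydude.this.is.realllylong",
                "sooooooo.long", "what.whatshouldIdo", "shouldIstayorshouldIgo")
short_names <- c("ymrln", "heydude", "slong", "wwsid", "sisosig")

# "unmapped.name" is a hypothetical value with no entry in long_names
species <- c("sooooooo.long", "unmapped.name", "what.whatshouldIdo")

# Position of each value in long_names, NA where there is no match
idx <- match(species, long_names)

# Replace matched values, keep unmatched ones as they are
renamed <- ifelse(is.na(idx), species, short_names[idx])
renamed
```

This is not how plyr implements it internally, just an illustration of the same result on a plain character vector.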

Upvotes: 3
