Reputation: 101
I have a matrix similar to this:
Data = matrix(
c('Ruppia A', 'Ruppia B', 'Ruppia C', 'Hydrobia A', 'Dog A', 'Cat A', 'Fresh',
'Fresh', 'Fresh','Fresh', 'Dirt', 'House'),
nrow=6,
ncol=2,
byrow=FALSE
)
I would like to be able to group similar records together into one column without losing any data. Something like this:
New_Data = matrix(
c('Ruppia A', 'Ruppia B', 'Ruppia C', 'Hydrobia A', 'Dog A', 'Cat A', 'Fresh',
'Fresh', 'Fresh','Fresh', 'Dirt', 'House', 'Ruppia', 'Ruppia', 'Ruppia',
'Ruppia', 'Dog', 'Cat'),
nrow=6,
ncol=3,
byrow=FALSE
)
For some of the records we could simply group by genus (Ruppia), but not all groupings will be based solely on genus, and some records may have to be combined into a shared group. I am only interested in a handful of species for this analysis and don't necessarily need it to return all species. In this example we are not interested in 'Dog' and 'Cat', and they could be dropped if that makes it easier.
Upvotes: 1
Views: 69
Reputation: 8880
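An additional solution:
library(dplyr)
library(tidyr)
Data %>%
as_tibble() %>%
# the default regex "([[:alnum:]]+)" captures the first word of V1
tidyr::extract(V1, "out", remove = FALSE)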
Upvotes: 1
Reputation: 887118
We can use str_remove
library(dplyr)
library(stringr)
Data %>%
as_tibble %>%
mutate(V1_new = str_remove(V1, "\\s+[A-Z]$"))
Upvotes: 1
Reputation: 1382
If your new column is like your first column, but with a capital letter after a space dropped (e.g. " A"), then you can simply do this:
library(dplyr)
Data <- as.data.frame(Data) # turn into a data frame first
Data %>% mutate(V1_new = gsub(" [A-Z]$", "", V1))
          V1    V2   V1_new
1   Ruppia A Fresh   Ruppia
2   Ruppia B Fresh   Ruppia
3   Ruppia C Fresh   Ruppia
4 Hydrobia A Fresh Hydrobia
5      Dog A  Dirt      Dog
6      Cat A House      Cat
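Note that the desired output in the question lists Hydrobia A under 'Ruppia', which a regex on the name alone cannot produce. If that grouping is intended, one option is a lookup vector from genus to group. The following is only a minimal sketch in base R: the names `group_lookup`, `genus`, `group` and `df_kept` are made up for illustration, and the Hydrobia-to-Ruppia mapping is an assumption read off the question's New_Data.

```r
# Rebuild the example matrix from the question
Data <- matrix(
  c('Ruppia A', 'Ruppia B', 'Ruppia C', 'Hydrobia A', 'Dog A', 'Cat A',
    'Fresh', 'Fresh', 'Fresh', 'Fresh', 'Dirt', 'House'),
  nrow = 6, ncol = 2, byrow = FALSE
)
df <- as.data.frame(Data, stringsAsFactors = FALSE)

# Hypothetical lookup: genus -> group (assumed mapping, edit as needed).
# Genera that are not listed here end up as NA.
group_lookup <- c(Ruppia = "Ruppia", Hydrobia = "Ruppia")

# Strip the trailing " A", " B", ... to get the genus, then map it
genus <- sub(" [A-Z]$", "", df$V1)
df$group <- unname(group_lookup[genus])

# Keep only the records that belong to a group of interest
df_kept <- df[!is.na(df$group), ]
```

With this, Dog and Cat get NA in `group` and are dropped from `df_kept`, while the original columns are kept untouched. Adding another species to the analysis only means adding an entry to the lookup vector.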
Upvotes: 2