Reputation: 628
I have a column of unique Document IDs, where certain IDs contain a Q or an A:
"702-591|source-871987", "702-591|source-872066",
"702-591|source-872336", "702-591|source-872557",
"702-591|source-873368", "702-591|source-876216",
"702-591|source-907269", "702-591|source-10754A", "702-591|source-10754Q",
"702-591|source-118603A", "702-591|source-118603Q", "702-591|source-119738A"
I want to create a simpler unique ID column (easy enough -- table$ID <- c(1:nrow(table))). But if the existing ID contains a Q or an A, I want that letter to be carried over into the new ID. Additionally, if two IDs are linked as a Q/A pair, I want them to share the same number, e.g. 1Q and 1A. For example, records 8 & 9 are "702-591|source-10754A" and "702-591|source-10754Q". Their new IDs would be 8A & 8Q, respectively. Records 1-5 would just have new IDs of 1-5. Do I need to use grep here?
Thanks!
Upvotes: 0
Views: 214
Reputation: 66819
This may be a little long, but I think it works. You'll have to install the stringr package to use it.
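library(stringr)
# Split each ID into the part up to the last digit and an optional trailing letter (Q or A)
df <- data.frame(str_match(tab$old_id, "(.*[[:digit:]]+)([[:alpha:]]?)"))
names(df) <- c("old_id", "nonqa", "qa")
# Number each distinct base ID in order of first appearance
df2 <- data.frame(nonqa = unique(df$nonqa))
df2$base <- seq_along(df2$nonqa)
# Attach the number to every record; note that merge() sorts the rows by nonqa
df3 <- merge(df, df2)
# Append the Q/A suffix (empty for IDs without one)
df3$id <- paste(df3$base, df3$qa, sep = "")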
In the end, you have "old_id" and "id" columns in that final data frame. I read your table into "tab" since "table" is already a function in R. For anyone else answering this question, here is the data:
tab = data.frame(old_id=c("702-591|source-871987", "702-591|source-872066",
"702-591|source-872336", "702-591|source-872557",
"702-591|source-873368", "702-591|source-876216",
"702-591|source-907269", "702-591|source-10754A", "702-591|source-10754Q",
"702-591|source-118603A", "702-591|source-118603Q", "702-591|source-119738A"))
Upvotes: 2
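Since the question asks about grep: the same idea can probably be done in base R alone, with sub() and grepl() from the grep family plus match(). The following is only a sketch under a few assumptions: the Q/A marker is always the last character of the ID, and the variable names (old_id, qa, nonqa, base, new_id) are made up for illustration. It uses a small subset of the IDs from the question and keeps the original row order, which the merge()-based answer does not.

```r
# A subset of the IDs from the question, for illustration
old_id <- c("702-591|source-871987", "702-591|source-872066",
            "702-591|source-10754A", "702-591|source-10754Q",
            "702-591|source-118603A")

# Trailing Q or A, or "" when the ID has no such suffix
# (assumes the marker is always the last character)
qa <- ifelse(grepl("[QA]$", old_id), sub("^.*([QA])$", "\\1", old_id), "")

# The ID with any trailing Q/A removed, so linked records share one key
nonqa <- sub("[QA]$", "", old_id)

# Number each distinct key in order of first appearance
base <- match(nonqa, unique(nonqa))

# Combine the number with the suffix
new_id <- paste0(base, qa)
new_id
# "1"  "2"  "3A" "3Q" "4A"
```

With the full column from the question, the same steps should give 1 to 7 for the first seven records and 8A and 8Q for records 8 and 9, because match() returns the position of the first occurrence of each key.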