Reputation: 384
I have a data frame in R and would like to add quotation marks at specific places in one of its columns. One row of this data frame looks like this:
> df
V1 V2 V3 V4 V5 V6 V7 V8 V9
1 chr9 17025523 17026706 SOX2 . - ncbiRefSeq transcript .
V10
1 gene_id SOX2; transcript_id NM_205188.2; gene_name SOX2;
I'm interested in the last column (df$V10):
> df$V10
gene_id SOX10; transcript_id NM_205188.2; gene_name SOX10;
And I would like to add quotation marks around each value that comes directly before a ";". The output would be:
> new_df$V10
gene_id "SOX10"; transcript_id "NM_205188.2"; gene_name "SOX10";
Thanks !
Upvotes: 0
Views: 46
Reputation: 101335
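Not sure if this is the thing you need, but you can try gsub, capturing every run of characters other than whitespace and ";" that directly precedes a ";" (a plain \\w+ would miss NM_205188.2 because of the "."):
r <- gsub("([^;[:space:]]+);", "\"\\1\";", v)
such that
> r
[1] "gene_id \"SOX10\"; transcript_id \"NM_205188.2\"; gene_name \"SOX10\";"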
DATA
v <- 'gene_id SOX10; transcript_id NM_205188.2; gene_name SOX10;'
Upvotes: 0
Reputation: 3388
You can use a regular expression to replace each word preceding a ;
with the word in quotes.
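library(stringr)
s <- 'gene_id SOX10; transcript_id NM_205188.2; gene_name SOX10;'
# Capture each run of non-blank characters directly before a ";" and re-emit it in double quotes
str_replace_all(s, '([^[:blank:]]+);', '"\\1";')
# "gene_id \"SOX10\"; transcript_id \"NM_205188.2\"; gene_name \"SOX10\";"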
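To apply the same replacement to the whole column rather than a single string, something like the following should work. This is only a sketch in base R: it assumes V10 is a character column, and the one-row df here is a hypothetical stand-in built from the example in the question.

```r
# Hypothetical one-row stand-in for the data frame in the question
df <- data.frame(
  V10 = "gene_id SOX10; transcript_id NM_205188.2; gene_name SOX10;",
  stringsAsFactors = FALSE
)

# gsub() is vectorised, so it processes every row of the column at once.
# The pattern captures each run of characters other than whitespace and ";"
# that directly precedes a ";" and re-emits it wrapped in double quotes.
new_df <- df
new_df$V10 <- gsub('([^;[:space:]]+);', '"\\1";', df$V10)

# cat() prints the raw string, without the backslash escapes print() shows
cat(new_df$V10, "\n")
```

The backslashes you see when printing the column at the console are only R's way of displaying quotes inside a string; they are not part of the stored value.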
Upvotes: 1