Curious

Reputation: 549

Extract letters from a string at different positions in R

I have two columns, Col1 and Col2, and I would like to extract letters from the same positions in both. The goal is to show which letter in Col2 replaced the letter in Col1. The positions come from the Position column, where the letter "E" marks each location whose letters should be extracted.


Here is what I tried using the substr function:

df <- data.frame ("Col1" = c("Stores","University","Street","Street Store"), 
       "Col2" = c("Ostues", "Unasersity", "Straeq","Straeq Stuwq"), 
       "Position" = c("EMMEMM","MMEEMMMMMM", "MMMEME","MMMEMEMMMEEE"), 
       "Desired Output" = c("S|O , r|u","i|a , v|s","e|a , t|q", "e|a , t|q , o|u , r|w , e|q"))


n <- which(strsplit(df$Position,"")[[1]]=="E")
#output for the first row:
# [1] 1  4

#then I used substr function:
substr(df$Col1, n, n)

#only the first character returned as below:
# [1] "S"

#desired output for first row:
# S|O , r|u

Upvotes: 4

Views: 2479

Answers (2)

MrFlick

Reputation: 206606

First, I'll make a helper function to extract the character at each given position:

subchr <- function(x, pos) {
  substring(x, pos, pos)
}
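For context, here is a small sketch of why the helper is built on substring rather than substr, which is what the question tried. The variable names (word, pos) are only for illustration. substr returns one result per element of its first argument, so with a single string only the first position is used, while substring returns one result per position.

```r
# Illustrative values taken from the first row of the question's data
word <- "Stores"
pos <- c(1, 4)

# substr() yields one result per element of `word`, so only pos[1] is used
with_substr <- substr(word, pos, pos)

# substring() recycles `word` across every position in `pos`
with_substring <- substring(word, pos, pos)

print(with_substr)     # [1] "S"
print(with_substring)  # [1] "S" "r"
```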

Then you can find all the positions you want to extract

extract_at <- lapply(strsplit(as.character(df$Position), ""), 
    function(x) which(x=="E"))
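To see what this step produces, here is a quick check on the four Position strings from the question. The vector name positions is just a stand-in for df$Position; the result is a list with one integer vector of "E" locations per row.

```r
# Stand-in for df$Position, copied from the question
positions <- c("EMMEMM", "MMEEMMMMMM", "MMMEME", "MMMEMEMMMEEE")

# Split each string into single characters and keep the indices equal to "E"
extract_at <- lapply(strsplit(positions, ""), function(x) which(x == "E"))

str(extract_at)
# List of 4
#  $ : int [1:2] 1 4
#  $ : int [1:2] 3 4
#  $ : int [1:2] 4 6
#  $ : int [1:5] 4 6 10 11 12
```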

And put those together to get the output you want

mapply(function(e, a, b){
  paste(subchr(a, e), subchr(b,e), sep="|", collapse=" , ")
}, extract_at, as.character(df$Col1), as.character(df$Col2))
# [1] "S|O , r|u" "i|a , v|s" "e|a , t|q"
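The three steps can also be wrapped into one function and run on the four-row data from the question. This is only a sketch: the name extract_pairs is made up here, and stringsAsFactors = FALSE is set so the columns are plain character vectors regardless of R version. Because the positions are found per row, the row with five "E" marks is handled the same way as the rows with two.

```r
# Four-row data from the question, with columns kept as character
df <- data.frame(Col1 = c("Stores", "University", "Street", "Street Store"),
                 Col2 = c("Ostues", "Unasersity", "Straeq", "Straeq Stuwq"),
                 Position = c("EMMEMM", "MMEEMMMMMM", "MMMEME", "MMMEMEMMMEEE"),
                 stringsAsFactors = FALSE)

# Hypothetical wrapper around the helper, position lookup, and paste steps
extract_pairs <- function(col1, col2, position) {
  extract_at <- lapply(strsplit(position, ""), function(x) which(x == "E"))
  mapply(function(e, a, b) {
    paste(substring(a, e, e), substring(b, e, e), sep = "|", collapse = " , ")
  }, extract_at, col1, col2, USE.NAMES = FALSE)
}

df$Output <- extract_pairs(df$Col1, df$Col2, df$Position)
print(df$Output)
# [1] "S|O , r|u"                   "i|a , v|s"
# [3] "e|a , t|q"                   "e|a , t|q , o|u , r|w , e|q"
```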

Upvotes: 2

Nicolas2

Reputation: 2210

Perhaps something like this, which assumes exactly two "E" positions per row:

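library(dplyr)    # provides %>% and mutate()
library(stringr)  # provides str_replace() and str_replace_all()

# x turns Position into a regex: "M" becomes ".", "E" becomes a capture group "(.)"
df %>% mutate(x=str_replace_all(chartr("M",".",Position),"E","\\(\\.\\)"),
          output=paste0(str_replace(Col1,x,"\\1"),"|",str_replace(Col2,x,"\\1"),
                  " , ",str_replace(Col1,x,"\\2"),"|",str_replace(Col2,x,"\\2")))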
#        Col1       Col2   Position Desired.Output              x    output
#1     Stores     Ostues     EMMEMM      S|O , r|u     (.)..(.).. S|O , r|u
#2 University Unasersity MMEEMMMMMM      i|a , v|s ..(.)(.)...... i|a , v|s
#3     Street     Straeq     MMMEME      e|a , t|q     ...(.).(.) e|a , t|q

Data:

df <- data.frame("Col1" = c("Stores","University","Street"),
                 "Col2" = c("Ostues", "Unasersity", "Straeq"),
                 "Position" = c("EMMEMM","MMEEMMMMMM", "MMMEME"),
                 "Desired Output" = c("S|O , r|u","i|a , v|s","e|a , t|q"))

Upvotes: 0
