Unai Vicente

Reputation: 379

Unexpected output from gsub in R: pattern is not replaced

As the output of a certain operation, I have the following data frame with 729 observations.

> head(con)
              Connections
1  r_con[C3-C3,Intercept]
2  r_con[C3-C4,Intercept]
3 r_con[C3-CP1,Intercept]
4 r_con[C3-CP2,Intercept]
5 r_con[C3-CP5,Intercept]
6 r_con[C3-CP6,Intercept]

As can be seen, everything except the electrode pair should be removed; in the first observation, for instance, only C3-C3 should remain. Below is my attempt, which I expected to return the data frame with everything else stripped out. If I'm not wrong (which I probably am), the regex syntax is OK, and from my understanding fixed=TRUE is also necessary. However, I do not understand the R output. I expected the pattern to be replaced with nothing (""), but instead it returns the following output, which doesn't make sense to me.

> gsub("r_con\\[\\,Intercept\\]\\","",con,fixed=TRUE)

[1] "3:731"

I believe this will probably be a silly question for an expert programmer, which I am far from being, and any insight would be much appreciated.

[UPDATE WITH SOLUTION]

Thanks to Tim and Ben, I realised I was using the wrong regex syntax and the wrong source. This is what worked for me:

con2 <- sub("^r_con\\[([^,]+),Intercept\\]", "\\1", con$Connections)

Upvotes: 1

Views: 233

Answers (2)

Ben

Reputation: 784

I think your problem is that you're passing the whole data frame con to the call instead of the column con$Connections. Also, as the other answer points out, you probably don't want to use sub this way.

Assuming your data is consistent, i.e. the strings in con$Connections all follow more or less the same pattern, the following works. I have set up this example:

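con <- data.frame(Connections = c("r_con[C3-C3,Intercept]", "r_con[C3-CP1,Intercept]"))
library(stringr)

# Extract the electrode pair from a single string
f <- function(x){
  # Keep everything before the first comma, e.g. "r_con[C3-C3"
  part <- str_split(x, ",")[[1]][1]
  # Drop the 6-character prefix "r_con[" by starting at character 7
  str_sub(part, 7, -1)
}

f(con$Connections[1])
# Apply f to every element of the column
sapply(con$Connections, f)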
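
To illustrate the first point, here is a minimal sketch (not from the original question; it reuses the two-row toy data frame, and stringsAsFactors = FALSE is my assumption to keep the column as character). gsub coerces anything that isn't a character vector with as.character, and a data frame is a list of columns, so each column seems to get collapsed into one deparsed string. With a factor column that string is probably built from the integer codes, which would explain an output like "3:731".

```r
# Toy data, assumed to be a plain character column
con <- data.frame(
  Connections = c("r_con[C3-C3,Intercept]", "r_con[C3-CP1,Intercept]"),
  stringsAsFactors = FALSE
)

# Passing the data frame: one result per column, not per row
whole <- gsub("r_con", "", con, fixed = TRUE)
length(whole)

# Passing the column: one result per row, as intended
col <- gsub("r_con", "", con$Connections, fixed = TRUE)
length(col)
col
```

So whatever pattern you end up with, it should be applied to con$Connections rather than to con.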

Upvotes: 3

Tim Biegeleisen

Reputation: 521178

The sub function doesn't work this way. One viable approach would be to capture the quantity you want, then use this capture group as the replacement:

x <- "r_con[C3-C3,Intercept]"
term <- sub("^r_con\\[([^,]+),Intercept\\]", "\\1", x)
term

[1] "C3-C3"   
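
As a side note on fixed=TRUE, here is a small sketch of why an escaped pattern cannot match with it (my own illustration on the same string x; the names unchanged, step1 and step2 are made up). With fixed = TRUE the pattern is taken as a literal string, so the backslashes are searched for literally and nothing acts as an escape or a wildcard.

```r
x <- "r_con[C3-C3,Intercept]"

# Regex-style escaping with fixed = TRUE: the literal text r_con\[ is searched for,
# which does not occur in x, so x comes back unchanged
unchanged <- sub("r_con\\[", "", x, fixed = TRUE)

# Literal patterns need no escaping with fixed = TRUE;
# the prefix and the suffix are removed in two separate steps
step1 <- sub("r_con[", "", x, fixed = TRUE)
step2 <- sub(",Intercept]", "", step1, fixed = TRUE)

unchanged
step2
```

The capture-group version above does the same in a single call, so I would still lean towards it; the two-step variant is only meant to show what fixed = TRUE actually does.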

Upvotes: 2
